This chapter covers what kind of model to design through web log mining. In other words, content that many people selected or viewed is recommended to later visitors who show similar behavior.
Data modeling for web usage mining
- Weighted user transactions (for each pageview)
- Weights may be based on user ratings of items in a collaborative filtering application
- The duration of the last page in a session is lost, so use the mean time in its place
- Use a normalized value of page duration instead of raw time duration
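The weighting idea above can be sketched in Python. This is only a minimal illustration under assumptions not stated in the notes: the function name `pageview_weights` is hypothetical, the "mean time" is taken to be the mean of the known durations in the same session, and normalization is done by dividing each duration by the session total.

```python
from statistics import mean

def pageview_weights(session):
    """Turn (page, duration) pairs into normalized pageview weights.

    `session` is a list of (page, seconds) tuples; a duration of None
    marks a page whose duration was lost (typically the last page).
    """
    known = [d for _, d in session if d is not None]
    # Assumption: replace a lost duration with the mean of the known ones.
    fallback = mean(known) if known else 0.0
    filled = [(p, d if d is not None else fallback) for p, d in session]

    total = sum(d for _, d in filled)
    if total == 0:
        return {p: 0.0 for p, _ in filled}

    # Assumption: normalize by the session total so weights sum to 1.
    weights = {}
    for page, duration in filled:
        weights[page] = weights.get(page, 0.0) + duration / total
    return weights

session = [("/home", 10.0), ("/products", 30.0), ("/cart", None)]
weights = pageview_weights(session)
```

Here the lost duration of `/cart` is filled with 20 seconds (the mean of 10 and 30), so the three pages receive weights of 1/6, 1/2, and 1/3.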
User-pageview matrix
A matrix that represents which pages each user visited and viewed.
When the ordering of pageviews in a transaction is not relevant, we can represent the transactions as a user-pageview matrix (or transaction matrix), UPM
- Clustering of transactions can lead to discovery of important user or visitor segments
- Item clustering and association rule or sequential pattern mining can find important relationships among items based on the navigational patterns of users in the site
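To make the UPM concrete, here is a small Python sketch. The function name `build_upm`, the page labels, and the weights are all made up for illustration; each transaction is assumed to be a mapping from pageview to weight.

```python
def build_upm(transactions):
    """Build a user-pageview matrix from weighted transactions.

    `transactions` is a list of dicts mapping pageview -> weight.
    Returns the sorted list of pageviews (column labels) and the
    matrix as a list of rows, one row per transaction.
    """
    pages = sorted({p for t in transactions for p in t})
    column = {p: j for j, p in enumerate(pages)}

    upm = []
    for t in transactions:
        row = [0.0] * len(pages)  # pages not visited keep weight 0
        for page, weight in t.items():
            row[column[page]] = weight
        upm.append(row)
    return pages, upm

# Hypothetical transactions with binary weights (1.0 = page was viewed).
transactions = [
    {"A": 1.0, "B": 1.0},
    {"B": 1.0, "C": 1.0},
]
pages, upm = build_upm(transactions)
```

Each row is one transaction and each column one pageview, so the order in which pages were visited is deliberately discarded. Rows can then be fed to a clustering algorithm to look for visitor segments.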
Pageview-feature matrix
The process of extracting features from the visited pages. Instead of treating a page simply as a page, we classify it or turn it into a vector.
- Semantic information (from the content of web pages)
- Extracting features like words or concepts from the site in a global dictionary
- We can generate pageview-feature matrix, PFM
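A PFM can be sketched along the same lines. The example below is an assumption-laden illustration: `build_pfm` is a hypothetical name, features are plain lowercase words split on whitespace, and the cell values are raw term counts. A real system would more likely use a proper tokenizer and a weighting such as TF-IDF.

```python
def build_pfm(page_texts):
    """Build a pageview-feature matrix from page content.

    `page_texts` maps pageview -> text. The global dictionary is the
    sorted set of all words seen on any page; each cell holds how
    often that word occurs on that page.
    """
    dictionary = sorted(
        {w for text in page_texts.values() for w in text.lower().split()}
    )
    column = {w: j for j, w in enumerate(dictionary)}

    pages = sorted(page_texts)
    pfm = []
    for page in pages:
        row = [0] * len(dictionary)
        for word in page_texts[page].lower().split():
            row[column[word]] += 1
        pfm.append(row)
    return pages, dictionary, pfm

# Hypothetical page contents.
page_texts = {
    "A": "British romance",
    "B": "romance comedy",
    "C": "comedy comedy",
}
pages, dictionary, pfm = build_pfm(page_texts)
```

Rows are pageviews and columns are features from the global dictionary, which here is `["british", "comedy", "romance"]`.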
Transaction-feature matrix
- Transforming user transactions into content-enhanced transactions containing the semantic features
- Multiplying the UPM by the PFM gives the transaction-feature matrix, TFM
- Web sites represented as a term-pageview matrix
- Content-enhanced transaction matrix may reveal segments of users that have common interests in different concepts as indicated from their navigational behaviors
We could apply association rules such as: { "British", "Romance", "Comedy" => "Hugh Grant" }
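The multiplication that produces the TFM can be illustrated with a few lines of plain Python. The matrices below are invented toy values (two transactions, three pageviews, three features), and `matmul` is a hypothetical helper written only for this sketch.

```python
def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x k), both lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of a must match rows of b")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Toy UPM: rows = transactions, columns = pageviews A, B, C.
upm = [
    [1, 1, 0],
    [0, 1, 1],
]
# Toy PFM: rows = pageviews A, B, C, columns = features
# ("british", "comedy", "romance").
pfm = [
    [1, 0, 1],
    [0, 1, 1],
    [0, 2, 0],
]

# TFM: rows = transactions, columns = features.
tfm = matmul(upm, pfm)
```

With these values `tfm` is `[[1, 1, 2], [0, 3, 1]]`: the first transaction leans toward "romance" and the second toward "comedy". Each row is a content-enhanced transaction, so clustering or association rule mining can now work on concepts rather than on individual pages.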