Some URLs have a trailing slash, for example /iamteacher.php/
This is bad news for pattern matching, so I remove all of them:
update tracking_page_hits set target_id = substring(target_id from '(.*[^/]+)/+$') where target_id ~ '[^/]/+$';
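Be careful with the regex. Does it remove all the slashes at the end, or just the last one? Here it removes all of them: the captured group has to end in a non-slash character, so /+ is left to consume every trailing slash.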
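To check the question above without touching the database, here is a minimal sketch that runs the same pattern through Python's re module. It assumes Python's regex engine behaves like PostgreSQL's for this simple pattern; the helper name strip_trailing_slashes is made up for illustration.

```python
import re

# Same pattern as in the UPDATE statement: the captured group must end in a
# non-slash character, and "/+$" then consumes every slash up to the end.
TRAILING_SLASHES = re.compile(r"(.*[^/]+)/+$")


def strip_trailing_slashes(target_id):
    """Return target_id without its trailing slashes, or None if the pattern does not match."""
    match = TRAILING_SLASHES.search(target_id)
    return match.group(1) if match else None


# One slash, several slashes, no trailing slash, and nothing but slashes.
for url in ["/iamteacher.php/", "/a/b///", "/iamteacher.php", "//"]:
    print(repr(url), "->", repr(strip_trailing_slashes(url)))
```

The last two inputs give no match. In PostgreSQL, substring() returns NULL when the pattern does not match, so the WHERE clause has to exclude such rows, including a value made up only of slashes.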
Friday, May 22, 2009
Monday, May 18, 2009
Recommending two social network analysis tools
Free ...
Simple ...
CFinder -- very easy to use. It helps find communities and cliques. But the new version doesn't show edge weights, though it takes them into account. Also, directed graphs look weird: if there are both A -> B and B -> A, it shows only one direction, not both.
Visone -- good for visualization, but it has no help file. I cannot imagine software without a help manual.
Sunday, May 3, 2009
social network analysis - data generation
We are going to use social network analysis to analyze IA teachers' communities. Two types of communities: 1) view projects 2) copy projects.
For community 1) view projects
IA has tracked project hits in a table since 2006-08-22 12:46:49.14858. So, the teacher pool will be:
- registered teachers who have logged in since 2006-08-22 12:46:49.14858
- registered teachers who have public projects
- teachers who have viewed others' projects or had their projects viewed by others
1) I am going to create a "view" of this teacher pool. It will be a SQL view, but I want to create it from my Java program, so that every time I run it, a new view is generated (and dropped after use).
2) I need to remove IA team members from this pool.
3) There is a concern that some projects have an "inflated" number of views just because of their position in IA. The IA showcase presents the most frequently visited projects and the projects with the most resources. Those projects remain there "forever" and have been visited more than a thousand times. But when I checked the tracking_page_hits table, the "inflated" numbers are caused only by unregistered users. Since we only analyze the teacher pool (registered users), this should be fine.
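The create-use-drop pattern for the pool view can be sketched as follows. This is only an illustration under loud assumptions: it uses Python with an in-memory SQLite database instead of Java and PostgreSQL, and the tables users, projects and project_hits (a simplified stand-in for tracking_page_hits) and all their columns are hypothetical. It also picks one reading of the criteria: a hit counts only if the project is public, the viewer is registered, and neither side is an IA team member.

```python
import sqlite3

# Start of hit tracking, truncated to whole seconds for this sketch.
TRACKING_START = "2006-08-22 12:46:49"


def build_demo_db():
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, is_team INTEGER, last_login TEXT);
        CREATE TABLE projects (project_id INTEGER PRIMARY KEY, owner_id INTEGER, is_public INTEGER);
        -- viewer_id is NULL when the visitor is not a registered user
        CREATE TABLE project_hits (viewer_id INTEGER, project_id INTEGER);
    """)
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
        (1, 0, "2009-04-01 09:00:00"),  # teacher
        (2, 0, "2009-04-02 09:00:00"),  # teacher
        (3, 1, "2009-04-03 09:00:00"),  # IA team member
        (4, 0, "2005-01-01 09:00:00"),  # last login before tracking started
        (5, 0, "2009-04-04 09:00:00"),  # teacher with no views either way
    ])
    conn.executemany("INSERT INTO projects VALUES (?, ?, ?)", [
        (10, 1, 1), (20, 2, 0), (30, 3, 1), (40, 4, 1),
    ])
    conn.executemany("INSERT INTO project_hits VALUES (?, ?)", [
        (2, 10), (None, 10), (3, 10), (1, 20), (1, 30), (1, 40), (1, 10),
    ])
    return conn


def teacher_pool(conn):
    """Create the pool view, read it, and always drop it afterwards."""
    # SQLite does not allow bound parameters inside a view, so the constant
    # is interpolated into the statement text.
    conn.execute(f"""
        CREATE TEMP VIEW teacher_pool AS
        WITH edges AS (
            SELECT h.viewer_id AS viewer_id, p.owner_id AS owner_id
            FROM project_hits h
            JOIN projects p ON p.project_id = h.project_id
            JOIN users v ON v.user_id = h.viewer_id  -- drops unregistered hits
            JOIN users o ON o.user_id = p.owner_id
            WHERE p.is_public = 1
              AND h.viewer_id <> p.owner_id          -- ignore self-views
              AND v.is_team = 0 AND o.is_team = 0    -- remove IA team members
              AND v.last_login >= '{TRACKING_START}'
              AND o.last_login >= '{TRACKING_START}'
        )
        SELECT viewer_id AS user_id FROM edges
        UNION
        SELECT owner_id FROM edges
    """)
    try:
        rows = conn.execute("SELECT user_id FROM teacher_pool ORDER BY user_id").fetchall()
        return [row[0] for row in rows]
    finally:
        conn.execute("DROP VIEW teacher_pool")


conn = build_demo_db()
print(teacher_pool(conn))
```

With the demo rows, only teachers 1 and 2 end up in the pool: the unregistered hit, the team member, the private project, the stale login and the self-view are all filtered out. The try/finally block is what guarantees the view is dropped even if the query fails.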
For community 2) copy projects, the teacher pool will be:
- teachers who have copied a project from, or had a project copied by, another teacher after '2008-09-07' (when the copy-project function was launched)
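The copy community can be turned into a weighted, directed edge list along the same lines. This is a rough sketch, not the actual data pipeline: the (copier_id, owner_id, copied_at) record layout and the function name copy_network are hypothetical, and '2008-09-07' is read as midnight at the start of that day, the way a timestamp comparison in PostgreSQL would treat it.

```python
from collections import Counter
from datetime import datetime

# Launch of the copy-project function, read as midnight of that day (assumption).
COPY_LAUNCH = datetime(2008, 9, 7)


def copy_network(copies):
    """Build weighted directed edges (copier -> owner) and the resulting teacher pool.

    copies: iterable of (copier_id, owner_id, copied_at) tuples (hypothetical layout).
    """
    weights = Counter()
    for copier_id, owner_id, copied_at in copies:
        if copied_at <= COPY_LAUNCH:
            continue  # keep only copies made after the launch
        if copier_id == owner_id:
            continue  # a copy of one's own project involves no other teacher
        weights[(copier_id, owner_id)] += 1
    # Everyone who appears on either end of an edge belongs to the pool.
    pool = {teacher for edge in weights for teacher in edge}
    return weights, pool


demo_copies = [
    (1, 2, datetime(2008, 10, 1)),
    (1, 2, datetime(2008, 11, 1)),
    (2, 1, datetime(2009, 1, 5)),
    (3, 3, datetime(2009, 1, 1)),  # self-copy, skipped
    (4, 1, datetime(2008, 8, 1)),  # before the launch, skipped
]

weights, pool = copy_network(demo_copies)
for (copier_id, owner_id), count in sorted(weights.items()):
    print(f"{copier_id} -> {owner_id} (weight {count})")
print("pool:", sorted(pool))
```

Keeping 1 -> 2 and 2 -> 1 as separate keys preserves both directions, which matters when the edge list is loaded into a tool that handles directed graphs.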