Last active
March 21, 2022 13:54
Revisions
-
aseemk revised this gist
Feb 9, 2014 . 1 changed file with 1 addition and 1 deletion.
@@ -10,7 +10,7 @@ RETURN user, numFollowers,
    ROUND(totalWeighted * 100 / numFollowers) / 100.0 AS avgFollowerWeight
```

The idea is to give more weight to followers who follow fewer *other* people. So for each follower, their "weighted" contribution is `1 / numFollowing`. To prevent division by zero, the original user is included in the "following". The result is that these "normalized" follower counts are always less than users' "simple" follower counts -- but they still make sense relative to each other. Nice!

One missing feature of the current query is that it ignores users w/ no followers. We can include them by changing the first relationship to an optional match, too, but then we get back to division by zero. We could always add 1 to numFollowing, but then that gives users with no followers a weighted follower count of 1, and users with actual followers a potentially lower weighted follower count. I haven't figured out how to solve this yet.
-
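The weighting described above can be sketched outside the database. This is a minimal Python sketch, not the gist's query: the function name `weighted_follower_counts`, the `(follower, followed)` edge-list input, and the optional `users` argument are all assumptions made for illustration. Each follower contributes `1 / numFollowing`, where `numFollowing` counts everyone that follower follows, the user included. Users listed in `users` who have no followers get a total of `0.0` and an average of `None`; that is just one possible convention, not a fix for the Cypher query.

```python
from collections import defaultdict


def weighted_follower_counts(follows, users=()):
    """Return {user: (numFollowers, totalWeighted, avgFollowerWeight)}.

    follows: iterable of (follower, followed) pairs (hypothetical input format).
    users:   optional extra users to report even if nobody follows them.
    """
    following = defaultdict(set)  # follower -> users they follow
    followers = defaultdict(set)  # user -> users following them
    for follower, followed in follows:
        following[follower].add(followed)
        followers[followed].add(follower)

    stats = {}
    for user in set(followers) | set(users):
        user_followers = followers.get(user, set())
        num_followers = len(user_followers)
        # Each follower contributes 1 / numFollowing; the count includes `user`,
        # so it is at least 1 and the division is safe.
        total_weighted = sum((1.0 / len(following[f]) for f in user_followers), 0.0)
        # Assumption: the average is left undefined (None) with no followers.
        avg = total_weighted / num_followers if num_followers else None
        stats[user] = (num_followers, total_weighted, avg)
    return stats


edges = [("a", "x"), ("b", "x"), ("b", "y")]
stats = weighted_follower_counts(edges, users=["z"])
print(stats["x"])
```

With these edges, `x` has two followers: `a` follows only `x` and contributes 1/1, while `b` follows two users and contributes 1/2, for a total of 1.5. A follower who follows nobody else contributes a full 1.0, so a weighted total can equal the simple count as well as fall below it.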
aseemk revised this gist
Dec 20, 2013 . 1 changed file with 1 addition and 1 deletion.
@@ -10,7 +10,7 @@ RETURN user, numFollowers,
    ROUND(totalWeighted * 100 / numFollowers) / 100.0 AS avgFollowerWeight
```

The idea is to weigh followers more who follow fewer *other* people. So for each follower, their "weighted" contribution is `1 / numFollowing`. To prevent division by zero, the original user is included in the "following". The result is that these "normalized" follower counts are always less than users' "simple" follower counts -- but they still make sense relative to each other. Nice!

One missing feature of the current query is that it ignores users w/ no followers. We can include them by changing the first relationship to an optional match, too, but then we get back to division by zero. We could always add 1 to numFollowing, but then that gives users with no followers a weighted follower count of 1, and users with actual followers a potentially lower weighted follower count. I haven't figured out how to solve this yet.
-
aseemk revised this gist
Dec 20, 2013 . 1 changed file with 40 additions and 50 deletions.
@@ -3,63 +3,53 @@ This is a Neo4j 1.9 (pre-2.0) query:
```
START user=node:nodes(type='user')
MATCH user <-[:follows]- follower -[?:follows]-> other
WITH user, follower, 1.0 / COUNT(other) AS weighted
WITH user, COUNT(follower) AS numFollowers, SUM(weighted) as totalWeighted
RETURN user, numFollowers,
    ROUND(totalWeighted * 100) / 100.0 AS totalWeighted,
    ROUND(totalWeighted * 100 / numFollowers) / 100.0 AS avgFollowerWeight
```

The idea is to weigh followers more who follow fewer *other* people. So for each follower, their "weighted" contribution is `1 / numFollowing`. To prevent division by zero, the user is included in the "following". The result is that these "normalized" follower counts are always less than users' "simple" follower counts -- but they still make sense relative to each other. Nice!

One missing feature of the current query is that it ignores users w/ no followers. We can include them by changing the first relationship to an optional match, too, but then we get back to division by zero. We could always add 1 to numFollowing, but then that gives users with no followers a weighted follower count of 1, and users with actual followers a potentially lower weighted follower count. I haven't figured out how to solve this yet.

Run against [The Thingdom](http://www.thethingdom.com/)'s database...

Top 10 users by their "simple" follower count:

```
==> +----------------------------------------------------------------------------------------------+
==> | ID(user) | user.firstName | user.lastName | numFollowers | totalWeighted | avgFollowerWeight |
==> +----------------------------------------------------------------------------------------------+
==> | 2        | "Aseem"        | "Kishore"     | 139          | 54.7          | 0.39              |
==> | 1        | "Daniel"       | "Gasienica"   | 72           | 19.06         | 0.26              |
==> | 39       | "Jenny"        | "Liu"         | 40           | 7.46          | 0.19              |
==> | 197      | "Ian"          | "Gilman"      | 22           | 3.41          | 0.15              |
==> | 4648     | "The Thingdom" | ""            | 20           | 3.32          | 0.17              |
==> | 317      | "James"        | "Darpinian"   | 19           | 2.1           | 0.11              |
==> | 443      | "Ben"          | "Vanik"       | 19           | 2.1           | 0.11              |
==> | 125      | "Shelley"      | "Gu"          | 19           | 2.4           | 0.13              |
==> | 6072     | "Jeremy"       | "Smith"       | 18           | 6.94          | 0.39              |
==> | 12186    | "Brad"         | "Feld"        | 18           | 5.79          | 0.32              |
==> +----------------------------------------------------------------------------------------------+
```

Top 10 users by their "weighted" follower count:

```
==> +----------------------------------------------------------------------------------------------+
==> | ID(user) | user.firstName | user.lastName | numFollowers | totalWeighted | avgFollowerWeight |
==> +----------------------------------------------------------------------------------------------+
==> | 2        | "Aseem"        | "Kishore"     | 139          | 54.7          | 0.39              |
==> | 1        | "Daniel"       | "Gasienica"   | 72           | 19.06         | 0.26              |
==> | 39       | "Jenny"        | "Liu"         | 40           | 7.46          | 0.19              |
==> | 6072     | "Jeremy"       | "Smith"       | 18           | 6.94          | 0.39              |
==> | 12186    | "Brad"         | "Feld"        | 18           | 5.79          | 0.32              |
==> | 197      | "Ian"          | "Gilman"      | 22           | 3.41          | 0.15              |
==> | 4648     | "The Thingdom" | ""            | 20           | 3.32          | 0.17              |
==> | 200      | "Frida"        | "Kumar"       | 8            | 3.23          | 0.4               |
==> | 3451     | "Anh Huy"      | "Truong"      | 7            | 2.68          | 0.38              |
==> | 11855    | "Andrew"       | "Dorne"       | 8            | 2.46          | 0.31              |
==> +----------------------------------------------------------------------------------------------+
```
-
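The `ROUND(x * 100) / 100.0` idiom in this revision's `RETURN` clause trims values to two decimals. Below is a rough Python counterpart; the helper names `round2` and `avg_follower_weight` are made up for illustration. It assumes `ROUND` rounds halves up, which matches the first revision's output where `ROUND(totalWeighted)` of 0.5 prints as 1. Python's built-in `round()` rounds halves to even, so `floor(x + 0.5)` is used instead.

```python
import math


def round2(x):
    """Counterpart of ROUND(x * 100) / 100.0 for non-negative x (halves round up)."""
    return math.floor(x * 100 + 0.5) / 100.0


def avg_follower_weight(total_weighted, num_followers):
    """Counterpart of ROUND(totalWeighted * 100 / numFollowers) / 100.0."""
    return math.floor(total_weighted * 100 / num_followers + 0.5) / 100.0


print(round2(1.6666666666666665))
print(avg_follower_weight(54.7, 139))
```

The query divides the unrounded `totalWeighted`, whereas the tables above only show the rounded value, so feeding table values into `avg_follower_weight` is an approximation of what the query computed.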
aseemk revised this gist
Dec 20, 2013 . 1 changed file with 1 addition and 2 deletions.
@@ -4,8 +4,7 @@ This is a Neo4j 1.9 (pre-2.0) query:
START user=node:nodes(type='user')
MATCH user <-[:follows]- follower -[?:follows]-> other
WITH user, follower, 1.0 / (0.0 + COUNT(other)) AS weighted
WITH user, COUNT(follower) AS numFollowers, SUM(weighted) as totalWeighted
RETURN user, numFollowers, ROUND(totalWeighted * 100) / 100.0 AS totalWeighted
ORDER BY ID(user)
```
-
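This revision's query uses `SUM(weighted)` where the earlier revisions folded the collected weights with `REDUCE(total = 0, w IN COLLECT(weighted) : total + w)`. The two compute the same total. A small Python analogy, with a made-up `weights` list standing in for the per-follower weights:

```python
from functools import reduce
import math

weights = [1.0, 0.5, 0.25]  # hypothetical per-follower weights

# Counterpart of REDUCE(total = 0, w IN COLLECT(weighted) : total + w)
folded = reduce(lambda total, w: total + w, weights, 0)

# Counterpart of SUM(weighted)
summed = sum(weights)

print(folded, summed)
```

For general floating-point inputs the two results may differ in the last digits depending on how the runtime sums, so `math.isclose` is the safer comparison than `==`.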
aseemk renamed this gist
Dec 20, 2013 . 1 changed file with 0 additions and 0 deletions.
File renamed without changes.
-
aseemk revised this gist
Dec 20, 2013 . 1 changed file with 2 additions and 12 deletions.
@@ -10,19 +10,9 @@ RETURN user, numFollowers, ROUND(totalWeighted * 100) / 100.0 AS totalWeighted
ORDER BY ID(user)
```

The idea is to weigh followers more who follow fewer *other* people. So for each follower, their "weighted" contribution is `1 / numFollowing`. To prevent division by zero, the user is included in the "following". The result is that these "normalized" follower counts are always less than users' "simple" follower counts -- but they still make sense relative to each other. Nice!

One missing feature of the current query is that it ignores users w/ no followers. We can include them by changing the first relationship to an optional match, too, but then we get back to division by zero. We could always add 1 to numFollowing, but then that gives users with no followers a weighted follower count of 1, and users with actual followers a potentially lower weighted follower count. I haven't figured out how to solve this yet.

Run against the current http://node-neo4j-template.herokuapp.com/ database:
-
aseemk renamed this gist
Dec 20, 2013 . 1 changed file with 5 additions and 1 deletion.
@@ -1,15 +1,17 @@
This is a Neo4j 1.9 (pre-2.0) query:

```
START user=node:nodes(type='user')
MATCH user <-[:follows]- follower -[?:follows]-> other
WITH user, follower, 1.0 / (0.0 + COUNT(other)) AS weighted
WITH user, COUNT(follower) AS numFollowers,
    REDUCE(total = 0, w IN COLLECT(weighted) : total + w) AS totalWeighted
RETURN user, numFollowers, ROUND(totalWeighted * 100) / 100.0 AS totalWeighted
ORDER BY ID(user)
```

The idea is to weigh followers more who follow fewer *other* people.
So for each follower, their "weighted" contribution is `1 / numFollowing`.
To prevent division by zero, the user is included in the "following".

The result is that this "normalized" follower count is always less than your
@@ -24,6 +26,7 @@ Haven't figured out how to solve this yet.

Run against the current http://node-neo4j-template.herokuapp.com/ database:

```
==> +-----------------------------------------------------------------+
==> | user                             | numFollowers | totalWeighted |
==> +-----------------------------------------------------------------+
@@ -70,3 +73,4 @@ Run against the current http://node-neo4j-template.herokuapp.com/ database:
==> | Node[93]{name:"tasinet"}         | 1            | 0.5           |
==> | Node[94]{name:"Щы"}              | 1            | 0.17          |
==> +-----------------------------------------------------------------+
```
-
aseemk revised this gist
Dec 20, 2013 . 1 changed file with 1 addition and 1 deletion.
@@ -28,7 +28,7 @@ Run against the current http://node-neo4j-template.herokuapp.com/ database:
==> | user                             | numFollowers | totalWeighted |
==> +-----------------------------------------------------------------+
==> | Node[3]{name:"ii"}               | 4            | 1.67          |
==> | Node[8]{name:"百度"}               | 6            | 2.62          |
==> | Node[13]{name:"testme"}          | 2            | 0.83          |
==> | Node[15]{name:"Jgl"}             | 12           | 5.73          |
==> | Node[16]{name:"xad2"}            | 16           | 6.23          |
-
aseemk revised this gist
Dec 20, 2013 . 1 changed file with 1 addition and 1 deletion.
@@ -28,7 +28,7 @@ Run against the current http://node-neo4j-template.herokuapp.com/ database:
==> | user                             | numFollowers | totalWeighted |
==> +-----------------------------------------------------------------+
==> | Node[3]{name:"ii"}               | 4            | 1.67          |
==> | Node[8]{name:"百度"}               | 6            | 2.62          |
==> | Node[13]{name:"testme"}          | 2            | 0.83          |
==> | Node[15]{name:"Jgl"}             | 12           | 5.73          |
==> | Node[16]{name:"xad2"}            | 16           | 6.23          |
-
aseemk revised this gist
Dec 20, 2013 . 1 changed file with 49 additions and 49 deletions.
@@ -2,10 +2,10 @@ This is a Neo4j 1.9 (pre-2.0) query:

START user=node:nodes(type='user')
MATCH user <-[:follows]- follower -[?:follows]-> other
WITH user, follower, 1.0 / (0.0 + COUNT(other)) AS weighted
WITH user, COUNT(follower) AS numFollowers,
    REDUCE(total = 0, w IN COLLECT(weighted) : total + w) AS totalWeighted
RETURN user, numFollowers, ROUND(totalWeighted * 100) / 100.0 AS totalWeighted
ORDER BY ID(user)

The idea is to weigh followers more who follow fewer *other* people.
@@ -24,49 +24,49 @@ Haven't figured out how to solve this yet.

Run against the current http://node-neo4j-template.herokuapp.com/ database:

==> +-----------------------------------------------------------------+
==> | user                             | numFollowers | totalWeighted |
==> +-----------------------------------------------------------------+
==> | Node[3]{name:"ii"}               | 4            | 1.67          |
==> | Node[8]{name:"百度"}               | 6            | 2.62          |
==> | Node[13]{name:"testme"}          | 2            | 0.83          |
==> | Node[15]{name:"Jgl"}             | 12           | 5.73          |
==> | Node[16]{name:"xad2"}            | 16           | 6.23          |
==> | Node[17]{name:"akbar"}           | 6            | 3.67          |
==> | Node[19]{name:"lalita"}          | 1            | 0.33          |
==> | Node[20]{name:"nice-app-thanks"} | 1            | 0.25          |
==> | Node[21]{name:"wowsy"}           | 8            | 4.31          |
==> | Node[23]{name:"jhgfjhgf"}        | 10           | 5.87          |
==> | Node[24]{name:"suroor"}          | 1            | 0.33          |
==> | Node[28]{name:"john2"}           | 1            | 1.0           |
==> | Node[31]{name:"pruebajuandd18"}  | 2            | 0.67          |
==> | Node[32]{name:"manish"}          | 2            | 0.83          |
==> | Node[33]{name:"dibz"}            | 2            | 0.5           |
==> | Node[34]{name:"demian"}          | 1            | 0.5           |
==> | Node[35]{name:"kumar"}           | 6            | 2.84          |
==> | Node[36]{name:"oliverjash"}      | 1            | 0.25          |
==> | Node[38]{name:"Mom"}             | 1            | 0.33          |
==> | Node[43]{name:" pete"}           | 1            | 0.5           |
==> | Node[45]{name:"sarah"}           | 2            | 0.39          |
==> | Node[46]{name:"diman1"}          | 1            | 0.33          |
==> | Node[48]{name:"tilli"}           | 9            | 3.78          |
==> | Node[52]{name:"HLASJQ"}          | 2            | 1.5           |
==> | Node[53]{name:"Momo"}            | 8            | 4.0           |
==> | Node[54]{name:"pepperone233"}    | 2            | 0.45          |
==> | Node[57]{name:"ElSebita"}        | 7            | 3.68          |
==> | Node[58]{name:"rahul"}           | 5            | 1.28          |
==> | Node[59]{name:"blup"}            | 1            | 1.0           |
==> | Node[60]{name:"demian2"}         | 13           | 7.14          |
==> | Node[62]{name:"Purnendu"}        | 3            | 1.08          |
==> | Node[63]{name:"test"}            | 8            | 4.33          |
==> | Node[67]{name:"woowee"}          | 4            | 2.39          |
==> | Node[68]{name:"jimz"}            | 1            | 1.0           |
==> | Node[73]{name:"ryan"}            | 1            | 0.25          |
==> | Node[75]{name:"träte2"}          | 2            | 0.58          |
==> | Node[76]{name:"ohMy"}            | 1            | 0.5           |
==> | Node[82]{name:"alex"}            | 1            | 0.25          |
==> | Node[85]{name:"fremNOKdegroBO"}  | 4            | 1.08          |
==> | Node[90]{name:"teest"}           | 6            | 2.03          |
==> | Node[93]{name:"tasinet"}         | 1            | 0.5           |
==> | Node[94]{name:"Щы"}              | 1            | 0.17          |
==> +-----------------------------------------------------------------+
-
aseemk revised this gist
Dec 20, 2013 . 1 changed file with 1 addition and 1 deletion.
@@ -28,7 +28,7 @@ Run against the current http://node-neo4j-template.herokuapp.com/ database:
==> | user                             | numFollowers | ROUND(totalWeighted) | totalWeighted       |
==> +----------------------------------------------------------------------------------------------+
==> | Node[3]{name:"ii"}               | 4            | 2                    | 1.6666666666666665  |
==> | Node[8]{name:"百度"}               | 6            | 3                    | 2.6166666666666667  |
==> | Node[13]{name:"testme"}          | 2            | 1                    | 0.8333333333333333  |
==> | Node[15]{name:"Jgl"}             | 12           | 6                    | 5.726190476190476   |
==> | Node[16]{name:"xad2"}            | 16           | 6                    | 6.233333333333334   |
-
aseemk created this gist
Dec 20, 2013 .
@@ -0,0 +1,72 @@
This is a Neo4j 1.9 (pre-2.0) query:

START user=node:nodes(type='user')
MATCH user <-[:follows]- follower -[?:follows]-> other
WITH user, follower, 1.0 / COUNT(other) AS weighted
WITH user, COUNT(follower) AS numFollowers,
    REDUCE(total = 0, w IN COLLECT(weighted) : total + w) AS totalWeighted
RETURN user, numFollowers, ROUND(totalWeighted), totalWeighted
ORDER BY ID(user)

The idea is to weigh followers more who follow fewer *other* people.
So for each follower, their "weighted" contribution is 1 / numFollowing.
To prevent division by zero, the user is included in the "following".

The result is that this "normalized" follower count is always less than your "simple" follower count -- but roughly in the same ballpark! Nice.

One missing feature of the current query is that it ignores users w/ no followers. We can include them by changing the first relationship to an optional match, too, but then we get back to division by zero. We could always add 1 to numFollowing, but then that gives users with no followers a weighted follower count of 1, and users with actual followers a potentially lower weighted follower count. Haven't figured out how to solve this yet.

Run against the current http://node-neo4j-template.herokuapp.com/ database:

==> +----------------------------------------------------------------------------------------------+
==> | user                             | numFollowers | ROUND(totalWeighted) | totalWeighted       |
==> +----------------------------------------------------------------------------------------------+
==> | Node[3]{name:"ii"}               | 4            | 2                    | 1.6666666666666665  |
==> | Node[8]{name:"百度"}               | 6            | 3                    | 2.6166666666666667  |
==> | Node[13]{name:"testme"}          | 2            | 1                    | 0.8333333333333333  |
==> | Node[15]{name:"Jgl"}             | 12           | 6                    | 5.726190476190476   |
==> | Node[16]{name:"xad2"}            | 16           | 6                    | 6.233333333333334   |
==> | Node[17]{name:"akbar"}           | 6            | 4                    | 3.6666666666666665  |
==> | Node[19]{name:"lalita"}          | 1            | 0                    | 0.3333333333333333  |
==> | Node[20]{name:"nice-app-thanks"} | 1            | 0                    | 0.25                |
==> | Node[21]{name:"wowsy"}           | 8            | 4                    | 4.30952380952381    |
==> | Node[23]{name:"jhgfjhgf"}        | 10           | 6                    | 5.866666666666666   |
==> | Node[24]{name:"suroor"}          | 1            | 0                    | 0.3333333333333333  |
==> | Node[28]{name:"john2"}           | 1            | 1                    | 1.0                 |
==> | Node[31]{name:"pruebajuandd18"}  | 2            | 1                    | 0.6666666666666666  |
==> | Node[32]{name:"manish"}          | 2            | 1                    | 0.8333333333333333  |
==> | Node[33]{name:"dibz"}            | 2            | 1                    | 0.5                 |
==> | Node[34]{name:"demian"}          | 1            | 1                    | 0.5                 |
==> | Node[35]{name:"kumar"}           | 6            | 3                    | 2.842857142857143   |
==> | Node[36]{name:"oliverjash"}      | 1            | 0                    | 0.25                |
==> | Node[38]{name:"Mom"}             | 1            | 0                    | 0.3333333333333333  |
==> | Node[43]{name:" pete"}           | 1            | 1                    | 0.5                 |
==> | Node[45]{name:"sarah"}           | 2            | 0                    | 0.39285714285714285 |
==> | Node[46]{name:"diman1"}          | 1            | 0                    | 0.3333333333333333  |
==> | Node[48]{name:"tilli"}           | 9            | 4                    | 3.783333333333333   |
==> | Node[52]{name:"HLASJQ"}          | 2            | 2                    | 1.5                 |
==> | Node[53]{name:"Momo"}            | 8            | 4                    | 4.0                 |
==> | Node[54]{name:"pepperone233"}    | 2            | 0                    | 0.45                |
==> | Node[57]{name:"ElSebita"}        | 7            | 4                    | 3.6761904761904765  |
==> | Node[58]{name:"rahul"}           | 5            | 1                    | 1.2833333333333334  |
==> | Node[59]{name:"blup"}            | 1            | 1                    | 1.0                 |
==> | Node[60]{name:"demian2"}         | 13           | 7                    | 7.142857142857142   |
==> | Node[62]{name:"Purnendu"}        | 3            | 1                    | 1.0833333333333333  |
==> | Node[63]{name:"test"}            | 8            | 4                    | 4.333333333333333   |
==> | Node[67]{name:"woowee"}          | 4            | 2                    | 2.392857142857143   |
==> | Node[68]{name:"jimz"}            | 1            | 1                    | 1.0                 |
==> | Node[73]{name:"ryan"}            | 1            | 0                    | 0.25                |
==> | Node[75]{name:"träte2"}          | 2            | 1                    | 0.5833333333333333  |
==> | Node[76]{name:"ohMy"}            | 1            | 1                    | 0.5                 |
==> | Node[82]{name:"alex"}            | 1            | 0                    | 0.25                |
==> | Node[85]{name:"fremNOKdegroBO"}  | 4            | 1                    | 1.0833333333333333  |
==> | Node[90]{name:"teest"}           | 6            | 2                    | 2.033333333333333   |
==> | Node[93]{name:"tasinet"}         | 1            | 1                    | 0.5                 |
==> | Node[94]{name:"Щы"}              | 1            | 0                    | 0.16666666666666666 |
==> +----------------------------------------------------------------------------------------------+