Last active
January 3, 2020 21:35
Revisions
klintan revised this gist
Jan 3, 2020. 1 changed file with 12 additions and 12 deletions.
@@ -6,15 +6,15 @@ def compute_source_trust(data, sources):
    :param sources: dict all unique sources and current scores
    :return: dict of unique sources with updated scores
    '''
    for source in sources:
        # t(w) trustworthiness of website w
        # the average confidence of all facts supplied by website/source w
        t_w = sum([confidence for confidence in data[data['source'] == source]['confidence'].values]) / len(
            data[data['source'] == source].index)
        # tau(w) trustworthiness score of website w
        # as explained in the paper, 1 - t(w) is usually quite small and multiplying many of them
        # might lead to underflow. Therefore we take the logarithm of it to better model how trustworthy a source is
        tau_w = -np.log(1 - t_w)
        # update the source score to the new score
        sources[source] = tau_w
    return sources
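The two quantities named in the comments can be written out as below. This is only a sketch: it assumes the paper the comments refer to defines the confidence of a fact f as one minus the product of (1 - t(w)) over the sources that provide it. The symbols F(w) (facts supplied by w), W(f) (sources providing f), s(f) and σ(f) do not appear in the gist and are introduced here for illustration.

```latex
t(w) = \frac{1}{|F(w)|} \sum_{f \in F(w)} s(f), \qquad
\tau(w) = -\ln\bigl(1 - t(w)\bigr)

s(f) = 1 - \prod_{w \in W(f)} \bigl(1 - t(w)\bigr)
\;\Longrightarrow\;
\sigma(f) = -\ln\bigl(1 - s(f)\bigr) = \sum_{w \in W(f)} \tau(w)
```

Under that assumption, the product of many small factors becomes a plain sum of τ(w) values, which is why the log-scale score avoids the underflow the comment mentions.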
klintan created this gist
Sep 7, 2018.
@@ -0,0 +1,20 @@
def compute_source_trust(data, sources):
    '''
    Compute every source trustworthiness.
    The trustworthiness score is the average confidence of all facts supplied by source w
    :param data: Dataframe all facts for object O
    :param sources: dict all unique sources and current scores
    :return: dict of unique sources with updated scores
    '''
    for source in sources:
        # t(w) trustworthiness of website w
        # the average confidence of all facts supplied by website/source w
        t_w = sum([confidence for confidence in data[data['source'] == source]['confidence'].values]) / len(
            data[data['source'] == source].index)
        # tau(w) trustworthiness score of website w
        # as explained in the paper, 1 - t(w) is usually quite small and multiplying many of them
        # might lead to underflow. Therefore we take the logarithm of it to better model how trustworthy a source is
        tau_w = -np.log(1 - t_w)
        # update the source score to the new score
        sources[source] = tau_w
    return sources
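The gist as posted has no imports, and it divides by zero for a source with no rows and takes log(0) when a source's average confidence is exactly 1. The following is a minimal runnable sketch of the same computation, not the author's version. The `eps` cap, the rule that a source without facts keeps its current score, the choice to return a new dict instead of updating `sources` in place, and the sample data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def compute_source_trust(data, sources, eps=1e-12):
    """Return a new dict mapping each source to tau(w) = -ln(1 - t(w)).

    `eps` is an assumption added for this sketch: it caps t(w) just below 1
    so that the logarithm stays finite.
    """
    # t(w): mean confidence of the facts each source supplies
    mean_conf = data.groupby("source")["confidence"].mean()
    updated = {}
    for source, old_score in sources.items():
        if source not in mean_conf.index:
            # Assumption: a source with no facts keeps its current score
            updated[source] = old_score
            continue
        t_w = min(float(mean_conf[source]), 1.0 - eps)
        # tau(w): log-scale trustworthiness score
        updated[source] = float(-np.log(1.0 - t_w))
    return updated


# Hypothetical sample data: two facts from a.com, one from b.com
facts = pd.DataFrame({
    "source": ["a.com", "a.com", "b.com"],
    "confidence": [0.9, 0.7, 0.5],
})
scores = compute_source_trust(facts, {"a.com": 0.0, "b.com": 0.0, "c.com": 0.3})
print(scores)
```

Here a.com averages a confidence of about 0.8, so its score is roughly -ln(0.2), while c.com supplies no facts and keeps its starting score of 0.3.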