<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <author>
    <name>Domain Administrator</name>
    <uri>http://www.freebase.com/view/user/domain_administrator</uri>
  </author>
    <generator uri="http://www.freebase.com/">Freebase Atom Feed Generator</generator>
    <id>http://www.freebase.com/view/dataworld</id>
    <link rel="self" href="http://www.freebase.com/feed/discuss/all/dataworld"/>
    <title>Data World</title>
    <updated>2008-10-06T13:55:11Z</updated>
  <entry>
    <author>
    <name>cheunger</name>
    <uri>http://www.freebase.com/view/user/cheunger</uri>
  </author>
    <content type="html">Thanks for the information - I've passed this on to our data team to check it out!</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000009020036</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000009020036" title="Medpedia: a new source for medical information"/>
    <summary type="html">Thanks for the information - I've passed this on to our data team to check it out! </summary>
    <title>Medpedia: a new source for medical information</title>
    <updated>2008-08-26T21:53:32.0012Z</updated>
  </entry><entry>
    <author>
    <name>d_leaper</name>
    <uri>http://www.freebase.com/view/user/d_leaper</uri>
  </author>
    <content type="html">&lt;p&gt;which is supposed to be both free and accurate.&lt;/p&gt;&lt;p&gt;please consider this as a source for freebase when it launches by the end of 2008.&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f800000000901fafd</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f800000000901fafd" title="Medpedia: a new source for medical information"/>
    <summary type="html">which is supposed to be both free and accurate. please consider this as a source for freebase when it...</summary>
    <title>Medpedia: a new source for medical information</title>
    <updated>2008-08-26T20:54:15.0018Z</updated>
  </entry><entry>
    <author>
    <name>cahlberg</name>
    <uri>http://www.freebase.com/view/user/cahlberg</uri>
  </author>
    <content type="html">Is this data source updated continuously or was it a one time upload?</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000008cbdc4b</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000008cbdc4b" title="SEC Board Members, Officers: updates?"/>
    <summary type="html">Is this data source updated continuously or was it a one time upload? </summary>
    <title>SEC Board Members, Officers: updates?</title>
    <updated>2008-07-24T02:44:17.0017Z</updated>
  </entry><entry>
    <author>
    <name>thione</name>
    <uri>http://www.freebase.com/view/user/thione</uri>
  </author>
    <content type="html">&lt;p&gt;Merge these pages: the three topics (Powerset; Powerset, Inc.; and PowerSet) should be merged, since they refer to the same entity.&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f80000000083120bb</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f80000000083120bb" title="Powerset: Merging these pages"/>
    <summary type="html">Merge these pages: the three topics (Powerset; Powerset, Inc.; and PowerSet) should be merged, since they refer...</summary>
    <title>Powerset: Merging these pages</title>
    <updated>2008-05-15T04:40:53.0012Z</updated>
  </entry><entry>
    <author>
    <name>will</name>
    <uri>http://www.freebase.com/view/user/will</uri>
  </author>
    <content type="html">&lt;p&gt;way to go Al! you must've used the data from the in-house mirror since i don't see a newer enwiki data dump from wikipedia. glad it worked.&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000005f22078</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000005f22078" title="periodic wikipedia topic load: periodic update"/>
    <summary type="html">way to go Al! you must've used the data from the in-house mirror since i don't see a newer enwiki...</summary>
    <title>periodic wikipedia topic load: periodic update</title>
    <updated>2007-10-18T16:13:33.0012Z</updated>
  </entry><entry>
    <author>
    <name>dsp13</name>
    <uri>http://www.freebase.com/view/user/dsp13</uri>
  </author>
    <content type="html">&lt;p&gt;Yes, inlinks are a simple and useful measure. Another which has received attention is number of edits. e.g. Wilkinson &amp; Huberman, &lt;a href="http://www.firstmonday.org/issues/issue12_4/wilkinson/"&gt;Assessing the value of cooperation in Wikipedia&lt;/a&gt;. Because Wikipedia is large and well-documented, and there's general interest in large graphs and the way they scale etc., there have been loads of measures made of it for various purposes - e.g. see Voss, &lt;a href="http://eprints.rclis.org/archive/00003610/01/MeasuringWikipedia2005.pdf"&gt;Measuring Wikipedia&lt;/a&gt;, Buriol et al, &lt;a href="http://www.dcc.uchile.cl/~ccastill/papers/buriol_2006_temporal_analysis_wikigraph.pdf"&gt;Temporal Analysis of the Wikigraph&lt;/a&gt;, Chernov et al, &lt;a href="http://www.l3s.de/~chernov/SEMWIKI2006.pdf"&gt;Extracting Semantic Relationships between Wikipedia Categories&lt;/a&gt;, Ollivier et al, &lt;a href="http://pierre.senellart.com/publications/ollivier2006finding.pdf"&gt;Finding Related Pages Using Green Measures: The Example of Wikipedia&lt;/a&gt;. (But this is no doubt telling my grandmother to suck eggs, since it's apparently not just that Microsoft is interested in leveraging wikipedia for &lt;a href="http://research.microsoft.com/users/silviu/Papers/emnlp07.pdf"&gt;named entity disambiguation&lt;/a&gt;, but also that you guys have some connection with Powerset!)&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000005f0fbfd</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000005f0fbfd" title="Wikipedia: wikipedia article length etc. as metric of importance of topic"/>
    <summary type="html">Yes, inlinks are a simple and useful measure. Another which has received attention is number of...</summary>
    <title>Wikipedia: wikipedia article length etc. as metric of importance of topic</title>
    <updated>2007-10-09T20:53:58.0000Z</updated>
  </entry><entry>
    <author>
    <name>faye</name>
    <uri>http://www.freebase.com/view/user/faye</uri>
  </author>
    <content type="html">&lt;p&gt;Hi there. Interesting idea (and thanks for sharing). I'd like to point out that Wikipedia has standards that require long articles to be split into multiple shorter articles. So at least for the more edited and popular topics, article lengths all converge toward a similar size over time. On the other hand, each time an article is split out, a link to it is added to the main article, so the number of links would be a better metric of article "importance", if such a thing can be considered in such absolute terms.&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000005f0fb41</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000005f0fb41" title="Wikipedia: wikipedia article length etc. as metric of importance of topic"/>
    <summary type="html">Hi there. Interesting idea (and thanks for sharing). I'd like to point out that Wikipedia has...</summary>
    <title>Wikipedia: wikipedia article length etc. as metric of importance of topic</title>
    <updated>2007-10-09T19:24:12.0007Z</updated>
  </entry><entry>
    <author>
    <name>dsp13</name>
    <uri>http://www.freebase.com/view/user/dsp13</uri>
  </author>
    <content type="html">&lt;p&gt;Sure, using these things to do autocomplete/predictive disambiguation requires dynamic calculations with local context. (Though they usually, of course, work with some static global measure as their background.) But global measures of 'overall importance' like length of a wikipedia page, or PageRank for a web page, do tell you something informative for some purposes. E.g., it would be interesting to be able to compare freebase coverage to that of wikipedia: which pages within a given wikipedia category or freebase type have quite a lot said about them in wikipedia but not much about them in freebase? which of two wikipedia categories has better coverage in freebase? Again, if an app goes to freebase to retrieve information which is visually displayed, it need not use any importance metric to select the information it wants but could use the metric in the display (size of text in a tag cloud, size of circle in a map, etc.).&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000005da55e7</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000005da55e7" title="Wikipedia: wikipedia article length etc. as metric of importance of topic"/>
    <summary type="html">Sure, using these things to do autocomplete/predictive disambiguation requires dynamic calculations...</summary>
    <title>Wikipedia: wikipedia article length etc. as metric of importance of topic</title>
    <updated>2007-09-28T14:32:01.0000Z</updated>
  </entry><entry>
    <author>
    <name>robert</name>
    <uri>http://www.freebase.com/view/user/robert</uri>
  </author>
    <content type="html">&lt;p&gt;These came from wikipedia categories like "alumni of Cambridge University".  I did it as an experiment to see if we could cover the more major universities.  It's something we'll turn into a more automated pipeline soon, so that we can keep synchronized with wikipedia.&lt;br /&gt;
&lt;br /&gt;
The hope is that if we at least get the institutions in, people will fill out the degrees and years over time.&lt;br /&gt;
&lt;br /&gt;
Also, if you have other sources of this information, I'd be happy to load it.  I find this kind of information to be really fun.&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000005d6eaef</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000005d6eaef" title="Alumni load -- people connected to educational institutions: Which alumni have been loaded?"/>
    <summary type="html">These came from wikipedia categories like "alumni of Cambridge University". I did it as an...</summary>
    <title>Alumni load -- people connected to educational institutions: Which alumni have been loaded?</title>
    <updated>2007-09-27T18:51:06.0007Z</updated>
  </entry><entry>
    <author>
    <name>tsturge</name>
    <uri>http://www.freebase.com/view/user/tsturge</uri>
  </author>
    <content type="html">&lt;p&gt;That's a very interesting idea. We have an infrastructure internally that knows about the interconnectedness of wikipedia articles; it's part of the system that lets autocomplete and search know that "London" is more likely to be the one in England than the one in Ontario.&lt;br /&gt;
&lt;br /&gt;
The issue with these metrics is that they are fairly dynamic and rather opaque and context sensitive. Different "George Bushes" are important if you are only interested in cricket rather than US politics for example. It's hard to get a single number which ranks "overall importance" in the same way that the web doesn't have a "most important page"; it all depends on what you are trying to do.&lt;br /&gt;
&lt;br /&gt;
So I'm curious what applications you see as using this number (or set of data) and how they would use it.&lt;br /&gt;
&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000005d66c80</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000005d66c80" title="Wikipedia: wikipedia article length etc. as metric of importance of topic"/>
    <summary type="html">That's a very interesting idea. We have an infrastructure internally that knows about the...</summary>
    <title>Wikipedia: wikipedia article length etc. as metric of importance of topic</title>
    <updated>2007-09-27T17:04:04.0000Z</updated>
  </entry><entry>
    <author>
    <name>dsp13</name>
    <uri>http://www.freebase.com/view/user/dsp13</uri>
  </author>
    <content type="html">&lt;p&gt;I'm reposting a comment I left at http://www.freebase.com/view/wikipedia, since no one seems to have noticed it there.&lt;br /&gt;
&lt;br /&gt;
Lots of topics have keys in /wikipedia/en. I wonder whether it would be worth uploading, for each topic in this namespace, a simple measure of the relative importance of this topic in wikipedia. The simplest such measure would be the length of the wikipedia article (other obvious candidates: number of inlinks, number of outlinks, number of editors - even computation of the page's pagerank in the wikipedia graph, if the matrix computation was thought worth the effort). This might allow more important topics to be selected for review and given more attention: ceteris paribus, the more important a wikipedia topic, the more needs to be said about it on freebase. (I don't know if you have thought about metrics to measure how much is said about a topic in freebase.)&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000005c94ecc</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000005c94ecc" title="Wikipedia: wikipedia article length etc. as metric of importance of topic"/>
    <summary type="html">I'm reposting a comment I left at http://www.freebase.com/view/wikipedia, since no one seems to...</summary>
    <title>Wikipedia: wikipedia article length etc. as metric of importance of topic</title>
    <updated>2007-09-21T17:10:20.0012Z</updated>
  </entry><entry>
    <author>
    <name>dsp13</name>
    <uri>http://www.freebase.com/view/user/dsp13</uri>
  </author>
    <content type="html">&lt;p&gt;The alumni load seems a great idea - there's a lot of nice specific data there. Casually browsing the Wikipedia category Alumni by university or college, though, I can't make out which alumni were loaded on Aug 24.&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000005c8d7d9</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000005c8d7d9" title="Alumni load -- people connected to educational institutions: Which alumni have been loaded?"/>
    <summary type="html">The alumni load seems a great idea - there's a lot of nice specific data there. Casually browsing...</summary>
    <title>Alumni load -- people connected to educational institutions: Which alumni have been loaded?</title>
    <updated>2007-09-14T23:11:45.0005Z</updated>
  </entry><entry>
    <author>
    <name>lukeschubert</name>
    <uri>http://www.freebase.com/view/user/lukeschubert</uri>
  </author>
    <content type="html">&lt;p&gt;It looks to me like this URL (http://unicode.org/iso15924/iso15924-codes.html) would be a good source for a data load for Language Writing Systems.&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000005b862a6</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000005b862a6" title="ISO 15924: Data load for Language Writing Systems?"/>
    <summary type="html">It looks to me like this URL (http://unicode.org/iso15924/iso15924-codes.html) would be a good...</summary>
    <title>ISO 15924: Data load for Language Writing Systems?</title>
    <updated>2007-08-22T03:32:53.0005Z</updated>
  </entry><entry>
    <author>
    <name>kurt</name>
    <uri>http://www.freebase.com/view/user/kurt</uri>
  </author>
    <content type="html">&lt;p&gt;This revert operation was intended to fix a duplicate protein dataset load that happened around June 11.&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f800000000589213e</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f800000000589213e" title="User Revert Operation For /user/mwcl_infobox: Removal of duplicate mwcl_infobox load"/>
    <summary type="html">This revert operation was intended to fix a duplicate protein dataset load that happened around...</summary>
    <title>User Revert Operation For /user/mwcl_infobox: Removal of duplicate mwcl_infobox load</title>
    <updated>2007-08-05T17:54:15.0011Z</updated>
  </entry><entry>
    <author>
    <name>robert</name>
    <uri>http://www.freebase.com/view/user/robert</uri>
  </author>
    <content type="html">&lt;p&gt;It looks more like you loaded 90K or less.
&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f80000000051d2d6e</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f80000000051d2d6e" title="Musical track length uniqueness: 800K?"/>
    <summary type="html">It looks more like you loaded 90K or less.</summary>
    <title>Musical track length uniqueness: 800K?</title>
    <updated>2007-06-20T07:54:16.0011Z</updated>
  </entry><entry>
    <author>
    <name>robert</name>
    <uri>http://www.freebase.com/view/user/robert</uri>
  </author>
    <content type="html">&lt;p&gt;This looked more like 8000 primitives.  We need a system to help predict how large a load is going to be.  Maybe this is the sort of estimate that the data pusher could provide.
&lt;/p&gt;</content>
    <id>http://www.freebase.com/view/guid/9202a8c04000641f8000000004f2ee55</id>
    <link rel="alternate" type="text/html" href="http://www.freebase.com/view/guid/9202a8c04000641f8000000004f2ee55" title="Hockey Roster Positions Loaded: ~8000 primitives"/>
    <summary type="html">This looked more like 8000 primitives. We need a system to help predict how large a load is going...</summary>
    <title>Hockey Roster Positions Loaded: ~8000 primitives</title>
    <updated>2007-04-23T06:09:36.0012Z</updated>
  </entry>
</feed>