Relationships

  1.  
    1. I suggest that the children property use a mediator rather than linking directly to the person. This is currently done for marriage (useful) and sibling relationships (not so useful right now, since there are no additional properties on the mediator). Children (and siblings) can come from a specific marriage or be tied to just one specific person. As it stands, there's no way of knowing which marriage a person's children came from if that person has had multiple marriages.

      1. This is a good suggestion, worth looking into. Thanks.

      2. All we need are topics for the persons and the unmediated links between them. Starting with the information about Person-1, a user finds that Person-1 has a child, Person-2. The user can then discover the other recorded parents of Person-2 by examining the information about Person-2, where the user finds that Person-2 has a parent, Person-3. What advantages do mediators offer in cases of this sort?

      3. I agree with Etan. People can be born into or out of wedlock, and their parents may get married at some later date. If you know who a person’s parents are and you know their birthday, you can then make inferences based on marriages of the parents, but that’s about the best you can do.

        A stronger argument for mediators is handling adoptive relationships as well as or instead of biological ones. That gets complicated.

      4. What is complicated about handling adoptive relationships?

      5. It’s a matter of what information is desired. Right now, we only have the parent/child relationship; it doesn’t say whether it’s exclusively biological or adoptive or either. That could easily result in people with multiple parents, or bad inferences, such as someone with two parents who were not married to each other until well after his birth; that could be an out-of-wedlock birth, or it could be a step-parent relationship. Genealogists, people looking for health information, and people curious about nobility would all want to know the specifics of the relationship. But on the other hand, particularly for living people, many people consider an adoptive relationship to be exactly the same as a biological one, and we need (IMO) to be sensitive to that.

      6. It seems to me that the complication is already there, regardless of whether Freebase deals with the complication. Are multiple parents currently impossible? Are bad inferences currently impossible? Are people currently devoid of curiosity regarding parent/child relationships? To what degree will the company Metaweb control Freebase in attempts to avoid offending people? To what degree will the company Metaweb control Freebase in attempts to avoid conflicting or ambiguous data? In any case, what do mediators offer that properties do not offer?

      7. In the absence of any qualifiers, people are going to assume that "parent" means biological parent.  It would definitely be useful to have adoptive relationships (with dates) included as well.  The two types of information are both useful.  As was pointed out, for some types of research (e.g. medical/genetic) only biological relationships are important.

        On the opposite side of things, the "sibling" relationship seems undesirable to have stored explicitly since it can be derived from the other primary relationships. 

      8. People may assume that the biological parent is intended, but the participants in most adoptive relationships prefer to have their relationship assumed to be equivalent to (if not identical to) a biological one. This is really a sort of privacy question; the fact that you or I were or were not adopted is not of general public interest; the fact that Nicole Richie or Pax Jolie-Pitt was adopted is of broader public interest. This will probably be addressed in the near future by a division of the Person type into public and private sub-types.

      9. I just discovered this reply after many months. Sorry for the delay!

        In my opinion the privacy concerns are orthogonal and are a red herring here. The participants can believe whatever they want, but biological parents and adoptive parents are undeniably different relationships.

        The decision to obscure the facts which are displayed for living people who aren't public personas shouldn't prevent getting the data accurate in the first place. That's not possible with the current schema.

        Take a look at Gerry Ford http://freebase.com/view/en/gerald_ford?pid=%2Fpeople%2Fperson%2Fparents. Leaving aside the fact that his grandfather is also listed as a "parent" (apparently the work of mw_template_bot), how would you accurately model his family using the current schema? How would you tell his adoptive parents from his biological parents? How would you tell which female parent went with which male parent? Would all the half-brothers and half-sisters just get linked together in one big undifferentiated pile of "siblings?"

      10. It’s less a decision to obscure the facts than a lack of a compelling use case for a more complicated model. The current model is simple and handles 95% of the use cases. Complicating the parent-child relationship, which would quadruple the amount of information needed to represent it in even the simplest cases, really needs a compelling use case. The blurb can tell human consumers that Gerald Ford was adopted. Is there a need for API-based applications to be able to make that distinction?

      11. Another long delay - I need to remember to log in more than once a month (or figure out how to get RSS feeds set up).

        I guess the real answer to whether the accuracy is needed depends on whether Freebase has aspirations to hold genealogical data.  The current scheme isn't adequate for anything genealogical, including medical applications which need genealogical information.

        I tend to lean towards the "if you're going to do it, you should do it right" camp, but I recognize that that often leads to over-engineering.  I don't know anything about how Freebase stores things internally, so it's difficult to evaluate where the "quadrupling" comes in and how big an impact it has in real-world terms.

        On the other hand, no one's addressed the sibling side of the argument.  That could clearly be derived from traversing the parent-child graph as well, yet it seems to have been important enough to record separately (and redundantly).

        That's enough lobbying for me though.  If anyone ever decides to change the decision, ping me and I'll provide pointers to how things are modeled in the genealogical data world.
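
For anyone weighing the two positions above, here is a rough sketch of what a mediated parent/child link might look like, and how siblings and half-siblings could then be derived by traversal instead of being stored separately. Everything in it is an assumption made for illustration: the `Parentage` record, its `kind` and `start` fields, the helper functions, and the placeholder people are hypothetical, and none of it reflects the actual Freebase schema or how Freebase stores data internally.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Parentage:
    """Hypothetical mediator: one parent/child link plus its qualifiers."""
    parent: str
    child: str
    kind: str                    # "biological" or "adoptive" (assumed vocabulary)
    start: Optional[str] = None  # e.g. an adoption date, when known


# Placeholder records only; these are not real people or real Freebase data.
LINKS = [
    Parentage("Person-A", "Person-1", "biological"),
    Parentage("Person-B", "Person-1", "biological"),
    Parentage("Person-C", "Person-1", "adoptive"),
    Parentage("Person-A", "Person-2", "biological"),
    Parentage("Person-C", "Person-2", "biological"),
    Parentage("Person-A", "Person-3", "biological"),
    Parentage("Person-B", "Person-3", "biological"),
]


def parents(child: str, kind: Optional[str] = None) -> Set[str]:
    """Recorded parents of `child`, optionally restricted to one kind of link."""
    return {
        link.parent
        for link in LINKS
        if link.child == child and (kind is None or link.kind == kind)
    }


def siblings(person: str, kind: str = "biological") -> Dict[str, str]:
    """Derive siblings by walking the parent/child links.

    Two or more shared parents is labelled "full", exactly one is labelled
    "half". With incomplete records, "half" only means "at least half".
    """
    mine = parents(person, kind)
    found: Dict[str, str] = {}
    for other in sorted({link.child for link in LINKS} - {person}):
        shared = mine & parents(other, kind)
        if shared:
            found[other] = "full" if len(shared) >= 2 else "half"
    return found


print(sorted(parents("Person-1", "adoptive")))  # ['Person-C']
print(siblings("Person-1"))  # {'Person-2': 'half', 'Person-3': 'full'}
```

With a plain unmediated parent property, the three parents of Person-1 would be indistinguishable and Person-2 and Person-3 would land in the same undifferentiated sibling list. The cost is the one raised earlier in the thread: every link carries extra fields even in the simplest case.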


