Data Modeling Guide
This document is intended for Freebase users who would like to develop new data models by creating new types and properties.
Although the Freebase application makes it a lot easier to create and maintain types, data modeling is still a difficult and nuanced discipline. By following these guidelines, you will make your data more usable, extensible and understandable to others in the Freebase community. It will also increase your chances of having a private domain approved to become public and usable by everybody.
Types should define collections of objects that share common properties
Although it's possible to use types to create loose collections of objects, this should be avoided. A type such as "Cooking" that includes topics on knives, vegetables and recipes is too general to be useful. Most critically, there are no properties that all of the instances are likely to have in common. In this case, it would be better to have several types, such as Knife, Vegetable and Recipe, each of which defines a concrete collection with its own set of properties. For instance, all knives have properties for "sharpness" and "steel composition", whereas vegetables have "color when ripe".
Be careful to not over-develop the model
If you are an expert in a subject area, you may be inclined to develop a very rich and intricate data model. Bear in mind that your contributors may not be as sophisticated and may have a hard time understanding your intent, and you may end up with incorrect or less complete information than if you had started with a simpler model.
Also, a lot of humility is required when building models. What you think is a good solution initially will often turn out to be wrong as more diverse data is added. It is better to start simple and add depth as the data and the community of users require it.
The best types define collections that are useful to many people
Types should include information that has some value to more than just one or a small group of people. A type for "Cat nose shape" or "Color blind musician" might be appropriate in a private domain that is created and maintained by a single user, but it is too idiosyncratic to be in a public Freebase domain.
Types should have singular names
Since the type name is often used to describe a single object, always use the singular: "Film" instead of "Films", "Person" instead of "People". The first letter of the first word of a type name should be capitalized. Each subsequent word should not be capitalized unless it would be capitalized in a sentence (such as a proper name).
Use existing types where possible
When creating your data model, you should check to see if there is an existing type in another domain that might be appropriate. Some domains contain types that are used generally throughout the system. For example, the Location domain includes the City/Town and Geolocation types, and the Measurement Unit domain includes the Dated integer, Integer range and Money value types.
Learn from existing data sources
Before you begin to create a new data model, learn from the structure of existing structured data sources. Very often the designers of these have thought through the more difficult modeling problems. In some cases, like Wikipedia templates and infoboxes, a large set of users have settled on a "folk schema" that best covers their interests. Take a look at the Wikipedia Template Viewer (insert link) where you can find templates that may help guide you.
Add real-world example data while building your data model
Data models created in isolation are often just an intellectual exercise. Even before you begin to create the types and properties, have in mind specific examples of the type you're building.
After your first draft, test your assumptions with a small, diverse set of data to prove to yourself that your model is general enough to scale. Typically when you try to add real-world examples you will have to modify your model repeatedly, sometimes in dramatic ways. Expect that your model will go through many revisions before it can be promoted to a publicly visible domain.
Order schema properties by importance
The order you give to properties in the schema editor is the order in which they appear when instances of that type are viewed. The most important properties should be at the top to make them more visible to users.
Pick property names that make sense in context
Very often, property names only make sense when the type is known. For instance, in the Film type, it is enough for a property to say "Budget" since it is understood that it is the budget of a film. However, in types that are likely to be blended with other types, you may have to describe the property more completely. For instance, in types associated with people, property names have to be more "fully qualified". Sometimes a Film Producer is also a TV Producer. In this case, the property for each type must include the type it belongs to: "Films produced" and "TV programs produced" rather than the ambiguous "Produced" for each.
Make sure that programmatic keys match display names
When you define types and properties, you are defining the API that will be used by applications accessing data. Types and properties have both "display names", intended for people, and "programmatic keys", intended for computers using the API. Changing a display name affects only the label of the type or property in the Freebase application. However, changing the programmatic key can break applications depending on your API. For this reason, programmatic keys should very rarely change once a domain is publicly visible and applications are running against it.
In the early schema-building phase when you are renaming types and properties, it is very easy to get the property and type display names out of sync with the programmatic keys that are used by applications built on your schema. (Currently there is no support in the UI for fixing the programmatic keys.)
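One way to catch display names drifting away from their keys is a small check run over a schema listing. The sketch below is illustrative only: the key convention (lowercase words joined by underscores), the helper names and the sample Film properties are all assumptions, and it does not use any Freebase API.

```python
import re


def suggested_key(display_name):
    """Derive a lowercase, underscore-separated key from a display name.

    The convention itself is an assumption made for this sketch.
    """
    words = re.findall(r"[a-z0-9]+", display_name.lower())
    return "_".join(words)


def out_of_sync(properties):
    """Return the programmatic keys whose display names no longer match them."""
    return [key for key, name in properties.items() if suggested_key(name) != key]


# Hypothetical properties on a Film type: programmatic key -> display name.
film_properties = {
    "directed_by": "Directed by",
    "budget": "Budget",
    "actors": "Performances",  # display name was renamed, the key was not
}

print(out_of_sync(film_properties))
```

A check like this is most useful before a domain becomes public, while keys can still be corrected without breaking applications.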
Use objects instead of literals where possible
Resist the temptation to use string literals instead of typed objects. For example, the "Artwork" object type has a property called "Media" with values like "acrylic paint" and "plaster". In a relational system, it might be more convenient to make these string literals. In Metaweb, however, it's just as easy to make them full objects with significant benefits.
The biggest benefit is that objects, unlike literals, can have reverse links that point back to the originating object. In our example above, if the "Media" property points to objects of type "Art Medium", then the object representing "Watercolor" would point to all "Artwork" objects that were painted in watercolors.
Another benefit is that typed objects autocomplete during data input. Whereas users must type in a full entry for string literals, for an object they need only type in the first few characters and select a match.
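The reverse-link benefit can be pictured with a toy in-memory graph. This is a minimal sketch rather than how Metaweb stores data: the `Graph` class, its method names and the artwork names are hypothetical.

```python
from collections import defaultdict


class Graph:
    """Toy store that records every link in both directions."""

    def __init__(self):
        self.forward = defaultdict(set)  # (subject, property) -> objects
        self.reverse = defaultdict(set)  # (object, property) -> subjects

    def link(self, subject, prop, obj):
        """Link subject to obj and remember the reverse direction too."""
        self.forward[(subject, prop)].add(obj)
        self.reverse[(obj, prop)].add(subject)

    def linked_from(self, obj, prop):
        """Return all subjects that point at obj through prop."""
        return sorted(self.reverse[(obj, prop)])


graph = Graph()
graph.link("Artwork A", "media", "Watercolor")
graph.link("Artwork B", "media", "Watercolor")
graph.link("Artwork C", "media", "Plaster")

# Because "Watercolor" is an object, it can answer "which artworks use me?"
print(graph.linked_from("Watercolor", "media"))
```

Had "Watercolor" been stored as a string literal on each artwork, there would be no single object to hang the reverse lookup on.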
Use Topic properties where possible
When an object is typed "topic", it automatically gains properties that are common to all topics. These are:
- Alias (also known as) - Aliases are alternate names for the topic. Objects (and topics) are allowed only a single display name, but they can have many aliases. Search and autocomplete match on aliases.
- Image - A topic can have one or more images associated with it.
- Webpage - A topic can have one or more webpages attached to it.
- Article (description) - A topic can have a single article that describes it.
Co-typing is necessary and good
A novel aspect of the Metaweb system is that instances may have multiple types. A single topic such as "Kevin Bacon" may have multiple types such as Person, Film Actor, TV Actor, Musical Artist and others. Since no single type could encapsulate such diversity, multiple types are required to hold all the properties needed to fully describe Kevin Bacon and his life.
Cross-typing is used to reconcile data that would normally have been in separate databases around a single object. For example, the Film and Music information for Kevin Bacon came from two different databases, each of which had its own local object representing him. When property values from both databases were brought into Metaweb, they were attached to the same object.
Cross-typing also provides a capability similar to object inheritance. Although Metaweb does not support inheritance directly, types can declare other types they depend on, which supply properties that would normally be expected. Film Actor, for instance, depends on the Person type to ensure that properties for "Date of Birth" and "Gender" exist. The Person type is depended on by many types to provide these basic properties.
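The way co-types and type dependencies combine can be sketched in a few lines. Everything here is illustrative: the dictionaries, function names and most property names are hypothetical. Only the Film Actor dependency on Person and the "Date of Birth" and "Gender" properties come from the text above, and the Musical Artist dependency is an assumption.

```python
# Types that a given type depends on.
INCLUDED_TYPES = {
    "Film Actor": ["Person"],
    "Musical Artist": ["Person"],  # assumption for this sketch
}

# Properties each type contributes (names other than Person's are made up).
TYPE_PROPERTIES = {
    "Person": ["Date of Birth", "Gender"],
    "Film Actor": ["Film performances"],
    "Musical Artist": ["Albums"],
}


def add_type(topic_types, new_type):
    """Add a type to a topic, along with every type it depends on."""
    if new_type in topic_types:
        return  # already present; also guards against dependency cycles
    topic_types.add(new_type)
    for included in INCLUDED_TYPES.get(new_type, []):
        add_type(topic_types, included)


def available_properties(topic_types):
    """Collect the properties contributed by all of a topic's types."""
    return sorted(p for t in topic_types for p in TYPE_PROPERTIES.get(t, []))


kevin_bacon = set()
add_type(kevin_bacon, "Film Actor")
add_type(kevin_bacon, "Musical Artist")

print(sorted(kevin_bacon))
print(available_properties(kevin_bacon))
```

Note that Person is added once even though two types depend on it, which is the practical difference from duplicating "Date of Birth" on every actor-like type.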
Create domain-specific co-types
Sometimes when you are filling in the "expected type" for a property you may not want to use a very generic type, but instead you may want to create a domain-specific type. For instance, in the type Film, the director property could expect a type Person. This would work from the perspective of Film, but because Person is such a generic type, it is unlikely that it would have a return property called "Films Directed" (if it did, it could very well have hundreds of return properties for all sorts of obscure things). Instead, the "Directed By" property expects a Film Director type. The Film Director type has a return property called "Films Directed" and depends on the "Person" co-type.
Since return properties increase the connectedness of data, it's generally a good idea to create more co-types. The Metaweb UI supports this well. An object added as a property value is automatically given the expected type. For example, if an untyped topic were added to a film's "Directed By" property it would be typed as a Film Director.
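The automatic typing described above can be sketched as follows. This is a toy model under stated assumptions, not the Metaweb implementation: the `Topic` class, the lookup tables and the `add_value` helper are hypothetical, and the topic names are placeholders.

```python
# Schema facts taken from the example: Film's "Directed By" expects a
# Film Director, whose return property is "Films Directed".
EXPECTED_TYPE = {("Film", "Directed By"): "Film Director"}
RETURN_PROPERTY = {("Film", "Directed By"): "Films Directed"}


class Topic:
    def __init__(self, name):
        self.name = name
        self.types = set()
        self.values = {}  # property name -> list of linked topics


def add_value(subject, subject_type, prop, value):
    """Link value to subject, typing value and filling the return property."""
    subject.types.add(subject_type)
    subject.values.setdefault(prop, []).append(value)
    # The value picks up the expected type automatically.
    value.types.add(EXPECTED_TYPE[(subject_type, prop)])
    # The return property points back at the subject.
    back = RETURN_PROPERTY[(subject_type, prop)]
    value.values.setdefault(back, []).append(subject)


film = Topic("Example Film")
director = Topic("Example Director")  # starts out untyped
add_value(film, "Film", "Directed By", director)

print(director.types)
print([t.name for t in director.values["Films Directed"]])
```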
Think twice before using "Topic" or "Person" as an expected type
Sometimes it seems simpler to use "topic" or "person" as an expected type when you are creating a property. Very often it makes more sense to create a more specific type.
For example, making the "Directed by" property on the "Film" type expect "Film director" instead of type "Person" will result in these benefits:
- All topics entered as values of this property are typed as directors, which is a useful piece of information that can be used in queries (such as "find me all German film directors").
- When the user enters a value, autocomplete matches against topics of type "Film director" first before all others. This makes it more likely that the user's choice will appear at the top of the list.
- You can have a "reverse" property from Film director that shows all films he directed. If you had used the more generic "Person" type, it wouldn't have made sense to have such a reverse property because most people don't direct films.
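The autocomplete benefit in the list above can be pictured as a ranking step. The sketch below assumes a simple rule (prefix matches of the expected type first, then everything else alphabetically); the real ranking is not described in this guide, and the function and topic names are hypothetical.

```python
def autocomplete(prefix, topics, expected_type):
    """Return matching topic names, with expected-type matches first."""
    prefix = prefix.lower()
    matches = [name for name in topics if name.lower().startswith(prefix)]
    # False sorts before True, so topics of the expected type come first.
    return sorted(matches, key=lambda name: (expected_type not in topics[name], name))


# Hypothetical topics and their types.
topics = {
    "Alex Stone": {"Person"},
    "Alexa Rivers": {"Person", "Film director"},
    "Alexandria": {"City/Town"},
    "Berlin": {"City/Town"},
}

print(autocomplete("alex", topics, "Film director"))
```

With "Person" as the expected type, both people would tie for the top spot and the director would no longer stand out.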
Avoid types that are too abstract
If you get too far into the "ontological" frame of mind you may be tempted to create types that are very abstract. For instance, if you were designing both the Film and TV domains, you might want to create a more generic Actor type in addition to the Film Actor and TV Actor types. When you consider doing this, though, you should try to imagine what properties you would take out of Film Actor and TV Actor and put into the more generic Actor type. In this particular case there really are no properties that are common between Film and TV actors that aren't already part of the Person type. For this reason the generic Actor type serves no purpose other than defining an abstraction with little practical value.
Since non-technical users contribute to and use the data in Metaweb, the data model should be kept as straightforward as possible. Abstraction and complexity have higher costs than they would in a system built and maintained by trained ontologists or software engineers. If a type serves little practical purpose, it should be avoided.
Use SI units
Until recently, Metaweb required that values with physical units use a floating point number, with the unit type added as part of the property name such as "Weight (kg)". Now when a property has an "expected type" of "floating point number", you can select a type of dimension (such as "weight", "distance" or "temperature") and a unit within that dimension ("kilograms", "meters", "Celsius").
Since you are defining the unit at the schema level, you are proclaiming how all values of that property should be expressed. If you set it to "Liters", for instance, all values of that property must be normalized to liters when they are written.
All public types should use SI (aka "Metric") units, where they can be used (almost everywhere). The Freebase application will at some point provide unit conversion in the UI for users of other measurement systems (such as Americans, Martians, etc.).
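Normalizing on write might look like the following sketch, which assumes a property whose schema unit is liters. The conversion table and function name are hypothetical, and a client application (not Freebase itself) is assumed to do the conversion before writing.

```python
# Conversion factors into the unit fixed by the schema (liters).
TO_LITERS = {
    "liter": 1.0,
    "milliliter": 0.001,
    "us_gallon": 3.785411784,  # US liquid gallon
}


def normalize_volume(value, unit):
    """Convert a volume into liters before it is written to the property."""
    try:
        factor = TO_LITERS[unit]
    except KeyError:
        raise ValueError(f"unknown volume unit: {unit}") from None
    return value * factor


# A contributor enters 2 US gallons; the stored value is in liters.
print(normalize_volume(2, "us_gallon"))
```

Rejecting unknown units outright, rather than guessing, keeps un-normalized values from slipping into a property that promises liters.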
Not every object in Metaweb is a Topic
Many objects in Metaweb are not significant enough to be talked about, or to have a description, an image or even a name. In fact, the majority of objects in Metaweb are not topics. Generally, compound values such as film performances are not topics and have no names. However, in some cases when it is needed, an instance of a mediator may be promoted to be a full topic by co-typing it.
Use Compound Values to show more complex relationships
If you wanted to show which actors appeared in a film, you might create a property in the Film type called "Actors" that expected objects of type Film Actor. This is a very simple model that defines a direct relationship between the two types. However, if you also wanted to store the name of the character the actor played in the movie, a simple relationship wouldn't be enough. Instead, you would need an object between the Film and Film Actor that has a property that stores the name of the character. This new object is a compound value that is situated between the two types, defining a more complex relationship. In this particular example, the object type would be a Film Performance.
Another example of a compound value is the Marriage type, which connects two Person instances together and holds the dates the marriage began and (optionally) ended.
In the Freebase application, compound values appear differently than other objects. Normally, when viewing a topic, you see only the display name (link) of linked objects or literal values, such as the director's name or the film's release date. However, if there is a link to a compound object, values within that object are shown instead of a display name. For the Film Performance example, the actor name and the character name are shown in place of a display name.
To make a type a compound value, click on the checkbox next to "Compound Value Type" at the top of the Schema Editor page, and then be sure to check the "display as disambiguator" box in each property that you would like to show in place of a display name when viewed from other objects.
Compound Values can also be used for basic system types that are used by many types. The Money Value type is a compound value that stores the currency type, the amount and a date the amount was valid.
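A compound value and its disambiguating properties can be sketched with a small data class. This is only an illustration: the field names, the `DISAMBIGUATORS` table and the `display_label` helper are hypothetical stand-ins for the "display as disambiguator" setting, and the example values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class FilmPerformance:
    """Nameless object sitting between a Film and a Film Actor."""
    film: str
    actor: str
    character: str


# Properties flagged as disambiguators, per compound value type.
DISAMBIGUATORS = {FilmPerformance: ("actor", "character")}


def display_label(value):
    """Build the text shown in place of a display name."""
    names = DISAMBIGUATORS[type(value)]
    return ", ".join(str(getattr(value, name)) for name in names)


performance = FilmPerformance(
    film="Example Film",
    actor="Example Actor",
    character="Example Character",
)

# Viewed from the film, the performance shows its actor and character.
print(display_label(performance))
```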
Include properties that are likely to be covered
There is little technical cost to adding additional properties to a schema. If an instance of the type has no values for a particular property, no resources are consumed. That said, adding a property to a schema incurs another cost -- that of the attention of the people who are adding, querying and viewing data. When you add a property to a type, try to imagine how much it will actually be used. If you anticipate that the ultimate coverage will not be good (say, less than 30% of all instances), then consider adding it to a co-type.
For example, a member of the Dining domain might want to add an idiosyncratic property on the Restaurant type that records whether or not squeeze ketchup bottles are used in the restaurant. Although this may be very important for a small set of users, it's unlikely to be filled in for most restaurants. It would be better encoded in a private co-type like "Squeeze bottle friendly restaurant" and used by the narrow community.
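Coverage is easy to estimate once some sample data exists. The sketch below treats the 30% figure as a rough threshold and flags properties with lower coverage as co-type candidates; the function names, the property key and the restaurant records are hypothetical.

```python
COVERAGE_THRESHOLD = 0.30  # rough guideline, not a hard rule


def coverage(instances, prop):
    """Fraction of instances that have a value for prop."""
    if not instances:
        return 0.0
    filled = sum(1 for inst in instances if inst.get(prop) not in (None, "", []))
    return filled / len(instances)


def placement(instances, prop):
    """Suggest where a property belongs, based on its coverage."""
    if coverage(instances, prop) < COVERAGE_THRESHOLD:
        return "co-type"
    return "main type"


# Ten hypothetical restaurants; only one records squeeze-bottle use.
restaurants = [{"name": f"Restaurant {i}"} for i in range(10)]
restaurants[0]["uses_squeeze_bottles"] = True

print(placement(restaurants, "name"))
print(placement(restaurants, "uses_squeeze_bottles"))
```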
However, some properties can be conditional
Sometimes it makes sense to include properties that may be used by only a fraction of a type's instances. For instance, the US Political Office Held type in the Government domain includes several conditional properties (covering state, city and county offices), only one of which is used for a particular instance.
Without these optional properties, there would have to be three separate Political Office Held types for State, City and County. Worse, there would have to be three properties on "US Politician" pointing to each of these. Splitting these apart would make the data model more complex and make querying for a politician's history more difficult.
Data modeling is about these kinds of tradeoffs -- sometimes it's better to slightly inconvenience a data contributor if it makes the data model easier to understand and query.
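The tradeoff can be made concrete with a sketch of one record type carrying conditional properties. The property names ("state", "city", "county"), the helper and the sample offices are hypothetical; the real US Political Office Held schema may differ.

```python
# Conditional properties: exactly one should be set on each instance.
CONDITIONAL_PROPERTIES = ("state", "city", "county")


def jurisdiction(office):
    """Return the single (property, value) pair that is filled in."""
    filled = [(name, office[name]) for name in CONDITIONAL_PROPERTIES if office.get(name)]
    if len(filled) != 1:
        raise ValueError(
            f"expected exactly one of {CONDITIONAL_PROPERTIES}, got {len(filled)}"
        )
    return filled[0]


# One politician's history lives in a single list under a single property.
offices_held = [
    {"office": "Mayor", "city": "Example City"},
    {"office": "Governor", "state": "Example State"},
]

history = [(o["office"], *jurisdiction(o)) for o in offices_held]
print(history)
```

The contributor has to know which one of the three properties applies, but a query for the full history walks one list instead of three.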

