Discussions on Publishing

The Book property on Author should not have an Author entry box

Book is set to display as disambiguator. That works fine for Book Editor but for an Author topic it doesn't make sense, and causes a bug if the author is entered.

Fixed.

Different type for collections of poetry, short stories, essays, etc.

I'd like to be able to enter the individual poems and short stories published in a book as properties of that book. For example, the book "The Works of Edgar Allan Poe — Volume 1" would have the following: http://www.gutenberg.org/etext/2147

The current Book type doesn't allow it, and perhaps, as it stands, it wouldn't make sense to just add a "contents" property for Book, since all novels published as books just contain themselves. So maybe a new Type or Types are in order:

Type: Collected Works (need a better name), parent type Book, property Content, expected type: something that covers all writing pieces that could be collected: poems, short stories, essays, etc.

Thoughts? Thanks.

I'm currently working on this very problem. You've identified some of the main issues around this problem, and they're proving surprisingly hard to work out, especially since the same structure will have to support magazine issue contents, etc. And don't even get me started on serialized books.

I was running into a similar problem regarding SF story collections and being able to link up short stories to collections. How about Anthology type?

What I'm trying to do, ultimately, is find a universal way to handle contents in a book (magazine, etc.), since the contents of an anthology, collection, omnibus, or magazine issue can include stories, poems, whole books, plays, and a bunch of things we haven't even typed yet like comics, comic strips, essays, etc. So a co-type with a "contents" property is one of the things I'm exploring, but the contents has to accept multiple types for it to be at all useful (unless we go with the very ugly method of having "story contents", "poem contents", "play contents", etc. for all instances). I have a few ideas about how to make this work, but it's taking longer than I had hoped.

How about having a parent class that all contents: stories, poems, plays, etc. are subtypes of. That type would then be the expected type for property contents.
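The parent-type idea above can be sketched in a few lines of Python. This is only an illustration of the modeling idea, not Freebase schema: the names WrittenWork, ShortStory, Poem, and Collection are hypothetical, and the titles are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class WrittenWork:
    """Hypothetical umbrella type for anything that can appear in a contents list."""
    title: str


@dataclass
class ShortStory(WrittenWork):
    """A subtype of the umbrella type."""


@dataclass
class Poem(WrittenWork):
    """Another subtype of the umbrella type."""


@dataclass
class Collection:
    """Hypothetical container; `contents` expects the umbrella type only."""
    title: str
    contents: list[WrittenWork] = field(default_factory=list)

    def add(self, work: WrittenWork) -> None:
        # One "contents" property accepts every subtype of WrittenWork.
        if not isinstance(work, WrittenWork):
            raise TypeError("contents only accepts WrittenWork subtypes")
        self.contents.append(work)


book = Collection("Example Collection")
book.add(ShortStory("Example Story"))
book.add(Poem("Example Poem"))
print([type(w).__name__ for w in book.contents])
```

A single property with one expected type avoids the separate "story contents", "poem contents", and "play contents" properties mentioned earlier in the thread.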

Has there been any resolution on this? There are some books which are really just collections of short stories that I'd like to enter.

You might want to check out the Published work type and the Short story type. Also, there is a Help Center topic on Entering the Contents of a Book or Periodical.

Poetic school or movement seed data entered

I've seeded the Poetic School or Movement type with 15 movements that should cover the basics. Please associate this info with your favorite poet or poem, if applicable. Thanks!

"Magazine" and other print publishing topics?

Any plans to add "magazine", "newspaper", "periodical", "journal", "monograph", and other publishing types? (E.g., I suppose we'll need "magazine editor", "newspaper editor", "journal editor", etc.)

Yes, once we get the problems with representing contents worked out, we'll be able to put in other publishing types. But we want to hold off until the contents model is more-or-less set so that we don't have to refactor a bunch of data or types to make them use it.

Genres? Like in Film?

Ben-Hur: Historical Fiction
The Count of Monte Cristo: Adventure Novel
The Big Sleep: Crime Story/Hardboiled/Mystery

etc.

I see, you have it under subject, that's a bit different than how Wiki and IMDB and some other book sites have it...I guess Freebase is going by the American Library system.

Our original intent had been that the "subject" type could include genres as well as Library of Congress-type subjects. If enough people think that having genre be a separate type would be better, though, we can certainly change that.

I would favor genres being separate from subject. While there may be some crossover between the two, in many cases the two are not the same. Foucault's Pendulum (Eco): Genre: Fiction; Subject: Templars.

Note that Films and TV Series have both Genre and Subject so it seems apropos to do the same with books, comics/graphic novels, poem and maybe even song.

It makes categorizing a bit more complex, but once works are correctly placed within subject matter and genre, finding a specific instance (or a range within a subject and/or genre) will be that much easier.

I've gotten some comments off-line on this as well, so I'll add genre to book and short story now.

Literary movements

Silly me, when I suggested Poetic School or Movement I had tunnel vision on poetry, completely forgetting that creating a sister type Literary Movement would be just as useful. Using my favorite author as an example, Edgar Allan Poe would fall under Dark Romanticism. Perhaps one can be added now? The schema would be very similar, substituting associated Poets with Authors and replacing Poems with...Books?

Eh, maybe not books, or else short stories (and essays, etc.) can't be included. Really need the umbrella type for literary work that I think Jeff is working on. Perhaps the work examples part of Literary Movement can be omitted for now.

Books I've Read

How would you suggest organizing data of the sort "Books I've Read" or maybe more broadly "Media I've Reviewed"? My first thought was to create a "Media List" type that ties a person to a list of media.

We don't have a generic type for all media, and I doubt that we will any time soon (if ever); however, you can create a property with an expected type of /common/topic, which will accept any type as input (I'm pretty sure that's how the property "favorite topics" on the user homepage works), so topics of different types could be listed under one property. This works only if you don't want the reverse property ("people who've read this" or whatever) to appear on the topic.

I also worked out a model for reviews/reviewers in my private domain, using a different approach, which does allow for the reverse property ("reviews of this work", in this case), and which should be extensible to other things: http://www.freebase.com/view/filter?id=/user/jeff/publication_test/review. Basically it's three types -- "reviewer" which would be cotyped to a person, "reviewed_work" which would be cotyped to anything that can be reviewed, and "review", which connects the two.

Does this answer your question at all?
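The three-type review model described above could look roughly like this in Python. It is a sketch under assumptions: the class and attribute names (Topic, Review, reviews_written, reviews_of, add_review) are hypothetical stand-ins, not the actual types in the linked private domain.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Topic:
    """Hypothetical topic that can carry the reviewer or reviewed_work co-type."""
    name: str
    types: set[str] = field(default_factory=set)
    reviews_written: list["Review"] = field(default_factory=list)
    reviews_of: list["Review"] = field(default_factory=list)


@dataclass(eq=False)
class Review:
    """The connecting type: links one reviewer to one reviewed work."""
    reviewer: Topic
    work: Topic
    text: str


def add_review(reviewer: Topic, work: Topic, text: str) -> Review:
    # Co-type both ends, then record the link in both directions so the
    # reverse property ("reviews of this work") is available on the work.
    reviewer.types.add("reviewer")
    work.types.add("reviewed_work")
    review = Review(reviewer, work, text)
    reviewer.reviews_written.append(review)
    work.reviews_of.append(review)
    return review


reader = Topic("Example Reader", {"person"})
novel = Topic("Example Novel", {"book"})
review = add_review(reader, novel, "Worth reading.")
print(novel.reviews_of[0].reviewer.name)
```

Because the review is its own object, the work can list its reviews, which the /common/topic approach does not give you.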

Normalization of Publication?

Hi, just joined today and was looking over the schema for short stories in particular. I'm very interested in models that capture what writers need. Currently, a publication lumps some important values together. As an example, you can look at David Levine's short story which is associated with Publication "F&SF July 2007". I'm creating some private types that move the issue information (July 2007, volume #, etc) into the join model (what you call "contents" right now). This will allow a publication "Magazine of Fantasy and Science Fiction" which can be aliased to F&SF. All short stories published in that market would be associated with a single instance instead of partitioning across every issue.

I'm building Writertopia.com, an online writing community, and there are a number of schema issues that have come up. I'd like to help populate the publishing database if we can get a schema that's useful for the writers associated with my website.

Moving the issue data onto the "contents" mediator is an interesting suggestion, and could reduce some of the complexity of the model. I'm not sure how this would work with works that have been published in non-periodical publications like books. Being able to identify the cover artist of magazines is also valuable information, which we would lose by eliminating the periodical issue types.

Jeff, I'm playing with a slightly different typing scheme than your domain's Published Work - Contents - Publication models. I've labeled my types as Published Work - Publication - Writing Market, where a Writing Market can be used as a co-type for an anthology, e-zine, magazine, etc. (Writing Market would include things of interest to writers like pay rate, submission info, and acceptable forms of story length and genre.)

A Published Work can be published in a number of ways. There can be reprints/translations/serialization into pages of various media. I've changed "Contents" to "Publication" because I think the meat of the modeling is in the join model, the connection between a work and where it's been printed. I think that join model might be a good place for the what/where/when/how info. I'm not sure which Publishing domain models are covering periodical issues. Could you give me a pointer to the types I should look at?

You raise a good point about the cover artist. It may be useful to have an object represent a single issue of a magazine, a day of an e-zine, a year of an anthology series. That would call for the creation of another model, say an Issue, that a Publication would optionally reference instead of directly linking to a Writing Market. That would make something like: Published Work - Publication - Issue - Writing Market. This would make certain queries easier but also complicate the schema. It would let you easily list all the issues of a magazine. If the issue information were redundantly embedded in the Publication join model, we'd have to extract it, which might not be that bad either. If the issue information stays in the join model, you could create a model for each "job" a cover artist takes -- call it Cover Art Work. The Cover Art Work could either have issue information, or if there was a separate issue model, it would link directly to that.

I think the Publication model could have some "if" fields and only capture pertinent information. A publication can have fields for Volume, Year, Month, Day, Issue #, and so on, and each publication would only use some subset of those fields.
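As a rough illustration of the join model with optional fields, here is a minimal Python sketch. The names Publication, WritingMarket, and issue_info are hypothetical, and the story titles and the volume and issue numbers are placeholder values, not real data.

```python
from dataclasses import dataclass
from typing import Optional

# The optional "issue" fields a publication may or may not use.
ISSUE_FIELDS = ("volume", "issue_number", "year", "month", "day")


@dataclass(frozen=True)
class WritingMarket:
    """Hypothetical single instance per magazine, anthology series, e-zine, etc."""
    name: str


@dataclass(frozen=True)
class Publication:
    """Hypothetical join model: one printing of a work in a market."""
    work_title: str
    market: WritingMarket
    volume: Optional[int] = None
    issue_number: Optional[int] = None
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None

    def issue_info(self) -> dict:
        # Report only the subset of fields this publication actually uses.
        return {
            name: getattr(self, name)
            for name in ISSUE_FIELDS
            if getattr(self, name) is not None
        }


fsf = WritingMarket("Magazine of Fantasy and Science Fiction")
publications = [
    Publication("Example Story A", fsf, year=2007, month=7),
    Publication("Example Story B", fsf, volume=1, issue_number=2),
]

# Every story in the market hangs off one market instance, not one per issue.
titles_in_market = [p.work_title for p in publications if p.market == fsf]
print(publications[0].issue_info(), titles_in_market)
```

The trade-off raised in the thread still applies: with no separate issue object, there is nothing to attach a cover artist to, and listing all issues means extracting them from the join records.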

This discussion also touches on a big issue I see with Freebase: how to best create schema when the intended use will vary substantially across users. Writers will want information that's irrelevant to readers, and you want to make the schema simple for readers but still support publishing "business" information. I think the co-type mechanism might be the key.

The periodical schema I've modeled is here: http://www.freebase.com/view/domain?id=/user/jeff/periodical_test (the Levine story you mention above links into this schema, but I had neglected to check "publish this type" for all the relevant types, so it probably didn't display very well, if at all). Play around with it -- I'm very interested to hear what you have to say about it.

You've hit the nail on the head with your comment about intended use. It can be a big design problem, especially in something as complex as publishing. Co-types are often the answer, although sometimes, as you suggest, using optional properties is sufficient. There are a lot of potential users for publishing (readers, writers, collectors, and researchers come to mind, and there are probably more), and we definitely haven't met everybody's needs yet.

One thought I just had, and I haven't had a chance to explore this at all thoroughly, is that "writing market" could be a co-type on some types of publications (magazines and newspapers, anthology series, etc.), but not on all, so that we can model, for example, an essay written as the introduction to a novel (not a market), republished in a magazine (market), printed in a year's-best anthology (market), and finally collected in a single-author collection (not a market).

One note I have is that we should avoid having redundant data in multiple types wherever possible -- it's too easy for the data to get out of synch, and it complicates queries by requiring users to query two or more objects for the same information.

Jeff, I'll look over the periodical_test domain. I already see I'm mistaken in thinking that "SF&F April 2007" was an unnormalized entry -- it's a display of several fields. Great. Is there a tool to see the nuts and bolts of where fields are drawing their contents? That would help the data modeling part.

The new types in Periodical Test domain look promising. Would a Magazine or Newspaper be non-periodical? (Guess I wonder about modeling that as a co-type instead of an included type.) I like the way you've created an "Issues Per Year" type that includes a date range. Very cool. I'm not sure about the grouping of the properties across the objects. For example, should Issues Per Year be in Magazine or placed in Periodical and Magazine includes the Periodical type?

I'll start putting my comments directly into the "discussion" area for the types.

Is there a tool to see the nuts and bolts of where fields are drawing their contents?

Bill, whenever you see a multi-valued entry in a property list—like “SF&F - April 2007,” and where you can click on each of the components separately—you can also use the drop-down menu that appears when you mouse over the compound value. From that menu, you can visit the compound value directly, and see or alter each of its component values directly.

Both newspapers and magazines have periodical as an included type. The way included types work is that whenever you create a new instance of something that has an included type, that included type is automatically applied as a co-type. So when you create a new magazine, it is automatically co-typed as a periodical.

Jeff, I was confused by the way types are listed for topics. For example, the Magazine of Fantasy & Science Fiction (http://www.freebase.com/view?id=%239202a8c04000641f8000000000222d0b) has types listed as Magazine (Periodical Test), Periodical (Periodical Test), and Employer (Business). I incorrectly assumed that being listed as a type meant co-typing and included types were assumed to be part of the subclass. In other words, for some reason with my previous OOP background, I was expecting F&SF to have types Magazine (Periodical Test) and Employer (Business) if Periodical were an included type, a "base class" of Magazine. I see now that both Newspaper and Magazine have included Periodical, and "Issues Per Year" is at the Magazine level and not the Periodical level. I'll try to stop misinterpreting how the schema is really set up :)

I would suggest some UI way of differentiating where types are applied to topics. For example, F&SF has the three included types, which could have come as co-types right on the F&SF topic (if there was no inclusion of Periodical into Magazine) or the current case of being a Magazine (+ Periodical by inclusion) and Employer. It could be as simple as color grouping types or as sophisticated as optionally popping-up a window to display relationships graphically, like MySQL Workbench for Freebase (http://www.mysql.com/products/tools/workbench/ see GUI pic on right.)

I'm not sure I follow you -- are you suggesting a way of indicating which properties belong to which type? If so, it would be worth mentioning on one of the feature request discussions, since that's well out of my purview or expertise. But if you're looking for an indication about HOW a type came to be applied to a specific topic, I think the answer is that Freebase (the application) doesn't care -- the effect is exactly the same whether a type is applied as an included type or added by a user. (Well, the order of type creation does affect the order in which properties are displayed in the UI, but that's just a UI convention.) The properties, and their relationships to the topic and each other, are identical regardless of how the types are added.

Jeff, I'll talk with Patrick on the issue -- it's more a UI/infrastructure than a publication schema suggestion. I think it'd be nice to see where properties are coming from. I'm also looking for an indication how a type became applied to a topic, mainly because it becomes clearer how to create new types and how I add types to new topics. If I add a new Anthology type, the Anthology type should include Periodical type instead of expecting the user to apply a Periodical co-type directly to a topic of type Anthology. Or when I'm adding the Zoetrope All-Story topic, I should add type Magazine (which automatically includes Periodical) instead of adding Periodical. The more users understand the schema, the less likely they'll break convention and take a different route to typing. Of course, a sufficiently interested user will drill down into all the type definitions and add the correct types, but the easier you make it to understand the schema (at the topic page, the type list page, etc), the more likely you're going to get correct entries.

Bill, you are not the first to request an indication of which properties are associated with which types. This is a UI feature request. I hadn't thought about it helping schema editors to understand the co-typing system (and to discover appropriate types to use as co-types). Good point.

For those of you just tuning in, here's how co-types work.

When I create a schema for a type, in that schema I can specify which co-types will also be assigned. Co-types are used to capture more general elements of a type. For example, an actor type can co-type as person so that 'actor' can be about those things unique to acting, and the more general properties of an actor - their birth date and birthplace, for example, can go into the person type. For a user who isn't editing schema, this happens more or less transparently. For someone doing data modeling, however, they have to create these co-type relationships and be aware of what types fit together or are generally useful (person, for instance).

Let's say I create a schema definition for a type, call it Type A, and in this schema definition I specify two co-types, call these Types B and C.

When I then type a topic as A, the co-types B and C are automatically applied as types to the topic. However, in the user interface (and API) I can remove type A, B, or C from this topic independently, because co-typing happens only when the type is initially applied to the topic, and there is no enforcement of the co-type after that initial application. In other words, the co-type statement in the schema is a suggestion rather than a requirement.

Here's a concrete example: the Film Actor type ( definition) has a co-type ("included types") of Person ( definition )

So if I type 'Ronald Reagan' as a film actor, he is then also typed as a person. All well and good.

If I type 'Lassie' as an actor, then Lassie is co-typed as a person - not good. But I can delete the person type from Lassie without affecting the actor type (and vice versa).
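The behavior described above (included types applied once, at typing time, with no enforcement afterwards) can be mimicked in a short Python sketch. This is a toy model, not the Freebase API: the INCLUDED_TYPES table, the Topic class, and the type ids are all hypothetical.

```python
# Hypothetical schema table: each type lists the types it "includes".
INCLUDED_TYPES = {
    "film_actor": ["person"],
    "magazine": ["periodical"],
}


class Topic:
    def __init__(self, name):
        self.name = name
        self.types = []

    def add_type(self, type_id):
        # Included types are applied only here, when the type is first added.
        for t in [type_id, *INCLUDED_TYPES.get(type_id, [])]:
            if t not in self.types:
                self.types.append(t)

    def remove_type(self, type_id):
        # No cascade and no re-check: the inclusion is a suggestion, not a rule.
        self.types.remove(type_id)


reagan = Topic("Ronald Reagan")
reagan.add_type("film_actor")

lassie = Topic("Lassie")
lassie.add_type("film_actor")
lassie.remove_type("person")

print(reagan.types)  # ['film_actor', 'person']
print(lassie.types)  # ['film_actor']
```

Once applied, a type looks the same whether it arrived by inclusion or was added by hand, which matches the earlier point that the application does not track how a type came to be on a topic.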

... the Film Actor type ( definition ) has a co-type ("included types") of Person ( definition ) ...

Love the featured topic! ;)

Yet another bookmark for my most frequented topic! ;) Thanks, Jeff...

Book/Story Location?

Hi, the Film Domain has a Film Location type that can be co-typed to places films are made, such as San Francisco for films like Vertigo. I'd like to see a similar type in the Publishing Domain to capture actual locations used in published work. I'm approaching this with fictional work in mind, but I suppose it could be applied to nonfiction as well.

There is a "fictional setting" type in the fictional universes domain, which can be used this way. The way to add the property to a work of literature is to add the type "work of fiction" to the topic in question. The fictional universe domain is pretty new, and we're still figuring out parts of it, so the documentation hasn't filtered out to related domains. "Short story" now has "work of fiction" as an included type, but existing stories aren't grandfathered in, alas.

Are we distinguishing between a real place (Berkeley, CA) used as the setting of a fictional work vs. a fictional place (Neptune, CA) invented for a fictional work? Is "fictional setting" to be used for both?

Type for segments in religious books

Books like the Bible or the Book of Mormon are compilations of various sources, so their structure is segments with chapters inside each division. It would be nice having a type for this, such as "Segment (Religious text)".

Under the main entry data it would have fields "author", "time of writing", "described time period", "notable characters", and on the right hand side "contained in" to refer to what book(s) it'd be in.

(Or something similar, I'm open to suggestions. I find a need to distinguish because it appears that there are two religious texts called "Book of Mormon", where one entry describes the whole book and another describes a segment in the book with the same title.)

This is a good idea; over in the religion domain, there's a "religious text" type. I could see something like this in either the publishing or religion domain. I'd try modeling it first in your private domain and see what you can work out. Feel free to ask here (or in the religion domain) for help, advice, etc.

Hi necz0r, I see what you're trying to model here. Before you come up with new schema, take a look at our existing types to see if they can be used to suit your need.

The "Religious Text" type in the Religion Domain is intentionally not named "Religious Book" so that it could apply to everything from a single article to more "anthology"-like volumes like the Bible. Topics can have multiple types in Freebase, as they do in real life. So you can co-type the Bible as a Religious Text, a Book, and a Publication, thus allowing applicable properties to be entered. A Publication has contents that could be separate works from different authors, which suits religious texts such as the Bible. There is also an existing Excerpt Type in the Publishing Domain that can be used to indicate segments in religious books. BTW, another type that might interest you is Translated Work in the Publishing Domain. Religious texts are often translated into many languages. If you have data in that area, I encourage you to add it to Freebase.

Type for physical storage locations of publications

I would like to see some Types added to the Publishing domain that represent the physical storage locations of publications:

Institutional repository (there are several existing Topics that I will suggest be merged)

Library

Warehouse

Thanks!

What sorts of properties do you think they should have? Or, perhaps another way of asking would be, what sorts of data are you interested in storing? The quickest thing would probably be to model them in your private domain. Once you're happy with the model, we can see about moving it to the root level domain.

Will attempt to model physical storage locations of publications privately

Thanks Jeff. I'm working on an article about Freebase for Searcher magazine (http://www.infotoday.com/searcher) and I think I'm getting the hang of things here. Likely what I'm asking for is already available as types under the System category of domains.

Capturing the pseudonym under which a work was published

Is there an existing property for indicating, for example, that Sylvia Plath's novel "The Bell Jar" was initially published under the pseudonym "Victoria Lucas" (later publications have used her real name)?

Not yet. We had a model, but we never got around to debugging it and putting it up. We'll get to it eventually, though!

Plans for scraping publishing data?

A quick search for "an essay by" on Wikipedia.org gives 800 results compared to the 96 pieces of short non-fiction on Freebase:

http://www.google.com/search?hl=en&client=firefox-a&rls=org.mozilla%3Aen-US%3Aofficial&hs=KjC&q=site%3Awikipedia.org+%22an+essay+by%22&btnG=Search

More generally, what are the plans to fill in publishing data? Can Freebase scrape Amazon or WorldCat data (http://www.worldcat.org/)?

We were waiting until the publishing schemata stabilized before going after large amounts of data. We're mostly done mucking about with it now, so we're going to start doing more with this. In terms of getting Wikipedia data typed correctly, we have some bots that look at categories and infobox templates which should type more of the essays, stories, poems, books, etc. that are in Wikipedia, and in some cases be able to associate them with authors.

We can't get Amazon or WorldCat data, because Amazon and OCLC don't have licenses compatible with CC-BY, alas. We do have some data from ISBNDB, which we're working on reconciling and loading.

Thanks Jeff! It's interesting to know what the plans are for these data loads. It's more satisfying to fill in data if you feel like all/most of the work that could have been done by a bot is already finished.

I agree about that! I can tell you that we don't have any current plans to import written works in bulk other than books, book editions, and possibly academic publications, so poems, essays, stories, etc. that aren't in Wikipedia will definitely be needed. And, for topics we got from Wikipedia, anything that's not in an infobox or implied by a category will probably not be extracted in the near future. (By "implied by a category", I mean that a category like "Sherlock Holmes stories by Arthur Conan Doyle" implies information about the author, the series, and one of the characters.)
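To make "implied by a category" concrete, here is a small Python sketch that pulls a series and an author out of a category name. It is purely illustrative and is not how the Freebase bots work: the "<series> stories by <author>" pattern, the function name, and the output keys are assumptions.

```python
import re
from typing import Optional

# Assumed naming pattern: "<series> stories by <author>".
CATEGORY_PATTERN = re.compile(r"^(?P<series>.+) stories by (?P<author>.+)$")


def facts_from_category(category: str) -> Optional[dict]:
    """Return the facts a category name implies, or None if it doesn't fit."""
    match = CATEGORY_PATTERN.match(category)
    if match is None:
        return None
    return {
        "type": "short_story",
        "series": match["series"],
        "author": match["author"],
    }


print(facts_from_category("Sherlock Holmes stories by Arthur Conan Doyle"))
```

Inferring the character as well (here, that the series name is also a character) would need extra knowledge that a pattern match alone cannot supply.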