News for Digital Journalists

Posts tagged with: Data

May 26, 2010

Investigative Journos: What government data do you want? FCC wants to help

Investigative journalists: What government datasets would you really like to have? Someone from the government is here to help.

Dr. Irene Wu, a researcher at the Federal Communications Commission, is seeking feedback from journalists that would help with the agency’s Future of the Media project. (KDMC covered this project in January.)

Here’s how you can participate…

Wu is compiling a list of datasets that investigative reporters would like to have. Ideally, this would be data that is supposed to be public but in practice is difficult to get in searchable electronic format.

Wu wants to understand not just which types of government data are in demand but also what kinds of problems reporters encounter in trying to obtain data. The FCC cannot compel the release of data, but the agency could make recommendations on how to improve the situation.

To submit your government data wish list and complaint list, e-mail Irene Wu.

December 02, 2010

IRE webinar: Understanding, using new US Census data

Later this month the US Census will publish the first detailed demographic data at the neighborhood level since 2000—and Investigative Reporters and Editors is ready to help journalists interpret and apply this valuable data.

IRE is offering for download a 25-minute webinar on using the new American Community Survey data…

The American Community Survey gathers information every year from 2 million households. The December release will mark the first time that five-year ACS data will be available for census tracts and other places with fewer than 20,000 residents. Journalists will be able to examine dozens of topics at the neighborhood level, as well as for every US city, town, school district and county.

Topics include: household income, citizenship status, educational attainment, commuting, health insurance, ancestry, disability, mortgage payment, utility costs, military service and type of employer. Results can be tallied for any level of geography—as well as by gender, race, age and ethnicity.

The IRE webinar explores story ideas, explains the nuts-and-bolts of ACS, and simplifies the confusing data release schedules.

Download the webinar
$5 for IRE members, $10 for nonmembers.

January 21, 2011

US Census upgrades American FactFinder tool, new data coming soon

Many journalists have long relied on the US Census’ American FactFinder online tool to analyze Census data. This week, that tool received a major facelift—and it soon will be populated with data from the 2010 Census…

The new American FactFinder features more ways to search, and more ways to manipulate tables and map data.

Table-related upgrades:

  • Customize table views
  • Sort and filter columns of a table
  • Transpose rows and columns
  • Save customized table

Map-related upgrades:

  • Select geographies from the map
  • Create maps from a table
  • Place labels and markers on maps
  • Download maps as PDFs

Coming soon:

  • Transpose rows and columns
  • Bookmark, download, and save/restore query

Take a virtual tour and read tutorials.

There’s also a guide to building deep links into American FactFinder. If you have existing links to data in the old FactFinder, the Census site warns: “The current American FactFinder will be discontinued in the Fall of 2011. At that time, any deep links into the discontinued system will no longer work.”

Data from the American Community Survey, the Economic Census, and Population Estimates will be moved to the new American FactFinder “in the coming months,” says the Census site. For now, you can access that data via the existing FactFinder interface.

February 15, 2011

The booming data business: Report, conference explore emerging options

News organizations generally don’t think of themselves as data companies, but they are, or at least most have the potential to develop this business alongside their news and other offerings. A new report and upcoming event from GigaOm could help news orgs figure out where data opportunities might lie, and how to capitalize on them…

The report Big Data (available to GigaOm Pro subscribers, 7-day free trial) covers the equipment and systems needed to store and manage large databases, or especially complex ones, as might be generated from a content management system and archive of decades’ worth of news stories, or from the web analytics for a complex, dynamic site.

Better data management tools can help journalists and editors analyze or visualize complex issues, especially those buried in unstructured information. They can make your publishing efforts more scalable. And, perhaps most important to the news business, they can support advertisers through data, analysis, and services.

These topics and more will be discussed at GigaOm’s March 23 event in New York City, Structure: Big Data 2011. One theme of particular interest to news publishers is how businesses are spinning out separate companies built around their data. The conference is mainly geared toward CIOs and technologists, but news publishers and technology managers might gain strategic insight here.

March 22, 2011

Why “data journalism” is good for the news business

Data journalism—where presenting data in useful, compelling ways becomes “the story”—is a growing part of journalism. In a recent article, UK data journalist and educator Paul Bradshaw explains how it’s also good for news organizations and journalists…

In Data journalism: Is it worth it? (published in InPublishing magazine), Bradshaw notes:

“When Simon Rogers first asked to publish data on the Guardian website, someone asked: ‘Who on earth would want to look at a spreadsheet online?’ It turned out that over 100,000 people would regularly hit the website to do just that. One person’s audit, it seemed, was another’s sticky content.”

Interactive presentations of data, from searchable databases to interactive data visualizations and more, have become a proven way to drive traffic and increase audience engagement. Unlike the traffic spike for a story-format news item, this traffic often lasts not just for a day or a week, but for months or even years. And high-traffic pages mean higher ad rates.

Data journalism also includes data about journalism—and about any other kind of content that news organizations publish. This represents still more news business options.

Bradshaw recommends that news organizations create Application Programming Interfaces (APIs) to distribute structured data about their content, so that others can repackage it or create a mashup by adding data from other sources. APIs attract the attention of programmers, who can create new and innovative data-based tools, experiences, or services that news organizations can implement—and which can be another channel for advertising or other revenue streams.

Bradshaw notes that the Guardian offered “Hack Day” events where programmers collaborated to develop products based on Guardian APIs. These events “led to all sorts of outcomes from personalized mobile editions, applications which would alert people to events and route them to the location, even a tool which suggests recipes based on an image uploaded by the user. The Guardian says they benefit from ‘being able to reach new markets that we might not otherwise find. We grow our vertical ad network through high quality partners [taking part in hack days]. We’re also able to offer our end users innovative, clever and useful interactive services provided by experts outside of our domain.’”

Bradshaw’s article also discusses other opportunities for news organizations to offer data-supported or data-focused services. This insight is useful for developing new business strategies—since in an age where news and other content is ubiquitous, it’s far easier to sell services than content.

March 25, 2011

Everyblock shifts direction, adds local discussion to data

Earlier this week Adrian Holovaty announced the first major redesign of his local data service Everyblock. This site is shifting from being a one-way news feed of local data, to becoming “a platform for discussion around neighborhood news.”

More about these new features…

In addition to adding a big “post” button to pages, Holovaty notes: “We’ve unveiled several new features to encourage positive community behavior. Each user contribution to our site has a ‘thank’ button next to it that lets you give positive reinforcement to the original poster for sharing information. We’ve built a lightweight neighborhood honors reputation system that rewards people for making contributions, as determined by their neighbors’ thanks and a number of other factors.”

Also, intriguingly, Everyblock now allows users to “follow” places, much the way Twitter users can follow other Twitter users.

GigaOm’s Mathew Ingram observed: “I think EveryBlock’s change of heart was a necessary one. I’ve argued in the past that whatever value local news sites have comes not from the data, but from the people at the heart of that community—which is why even poorly designed services that are built by the people in a town or neighborhood are almost always better than services that are set up by companies with a one-size-fits-all approach. History is littered with examples of well-meaning services such as Backfence and Bayosphere that never really connected with the communities they were supposed to serve.”

It seems to me that Everyblock might want to try to integrate more fully with Facebook, Twitter, Foursquare, Yelp, and Flickr, since those services are where so much discussion about community happens. But it would be hard to do that in an automated way. Once a service moves toward hosting public discussion, it really seems to need the hand of a community manager to get the posts flowing, and to keep the flames down. Everyblock will also have to guard against inevitable spamming of its system.

Because thriving community engagement depends on human staff effort, I’m skeptical that these new discussion features will last at Everyblock. But a strategy based more on curating conversations that happen on other sites and bringing that content into Everyblock might be at least partially automatable and thus more sustainable. And there’s room for Everyblock to move in that direction.

Of all these new Everyblock features, I think the most promising is the ability to follow places, and to receive that information as a feed or via e-mail. I live in Oakland, CA—which is just across the bay from San Francisco. SF is an Everyblock city; Oakland is not. But Oakland does have the lovely Oakland Crimespotting interactive map by Stamen Design. I would love to be able to “follow” a neighborhood or area on that map and have it update me with new incidents.

The News for Digital Journalists blog is made possible by a grant to USC Annenberg from the John S. and James L. Knight Foundation.

June 24, 2011

Data Journalism a focal point for latest Knight News Challenge

The Knight News Challenge used its fifth and final year of grants June 22 to put down a marker: the contest is betting big money on building tools to help make sense of data for journalists.

The Associated Press Overview project, for instance, scored a $475,000 grant to develop ways to scour large databases in order to visualize data and find stories (here’s more on it from AP and the Nieman Lab blog). A Chicago Tribune effort dubbed Panda will get $150,000 to build a set of open-source, web-based tools to make it easier for journalists to use and analyze data.

Commentators on Twitter and in blogs immediately noted this year’s heavy focus on data. Mathew Ingram wrote in GigaOm and the New York Times, for example: “There’s a theme running through most of the winners: namely, data as journalism.” The Poynter Institute’s Steve Myers noted that almost a third of this year’s $4.7 million in grants is meant “to help journalists and the public organize and analyze data and documents. In different ways, several of these projects seek to solve the persistent challenges of journalists working on investigative and daily stories: how to make sense of vast amounts of data and find the stories within.”

Knight noted the nod to data in announcing this year’s winners, writing that one set of experiments in the latest round would “[h]elp newsrooms organize and visualize large data sets so that they can find relationships and stories they might not have imagined.” The contest’s overseer and Knight Director of New Media John Bracken blogged that one emphasis was: “The need to make better sense of the stream. News consumers and journalists alike need help making sense of the streams of data now available to us.”

Among the other winners that focused on data journalism are:

  • DocumentCloud, which had already garnered a 2009 News Challenge award to develop document-based reporting software, got another $320,000 for an annotation tool to help crowdsource large sets of documents.
  • ScraperWiki won $280,000 to create a “data on demand” feature to help journalists request and manage data sets, as an add-on to existing services that help journalists and others create data “scrapers” to collect, store and publish public data. The organization will also host “journalism data camps” in 12 U.S. states.
  • OpenBlock Rural gets $275,000 to work with local governments and community papers in North Carolina to aggregate and publish government data.
  • Ushahidi will see $250,000 to develop SwiftRiver, a platform for evaluating crowdsourced information in an unfolding news environment.
  • Spending Stories gets $250,000 to contextualize news by tying it to the data on which it’s based, using automated analysis and user verification.

The latest grants, which this year included an additional $1 million in funding from Google, are the last in a series of 76 projects funded to the tune of $27 million since the first were issued in 2007. [Full disclosure: This author, with fellow KDMC blogger and colleague Amy Gahran, was a previous winner.]

The closing of this stage of the News Challenge prompted some analysis from Knight (including this graphic representation) and from other industry observers on the program’s impact. Nieman Lab’s Joshua Benton blogged: “The entrepreneurial spirit that the News Challenge tried to bring to journalism is far further along, and more players—nonprofits, tech companies, venture capitalists, lean startups, and even those old warhorses in the traditional media—are more willing to try new strategies, throw out old workflows, and build new products and tools.” Poynter Institute’s Jeff Sonderman wrote about how the News Challenge “pushed new approaches for journalism: Crowdfunding, the hacker-journalist, data as news and citizen journalism.”

Knight has suggested the program will reemerge in a new form and is openly seeking input on the News Challenge’s future direction. Observers on hand for the latest round of winners tweeted, for instance, that the program might speed up its funding cycle by awarding grants quarterly instead of annually, and might also earmark a portion for a venture fund.

Check out a full list of this year’s grantees.

January 06, 2012

U.S. Census: old American FactFinder is retiring, update your deep links!

For a dozen years American FactFinder has been a top online tool for accessing and presenting U.S. Census data. Today the Census Bureau announced that it’s finally discontinuing the legacy version of this site, and moving fully to the new version of American FactFinder.

If your site has existing deep links to the original American FactFinder, you’ll need to update them this month to keep them working. Here’s how you can do that…

The legacy American FactFinder site will retire on Jan. 20. At that point, existing deep links to database queries will no longer work. To keep them working, you’ll have to replace them with links to the same queries in the new American FactFinder.

Unfortunately, there doesn’t appear to be an “update” or “conversion” process for AFF links per se. You’ll probably have to simply redo your queries in the new AFF, generate a new deep link for each, and update your site with that link. This could be a bit of a hassle, but for important content it’s probably worth the effort.
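
Before redoing queries by hand, it may help to list which pages still contain legacy links. The following is a minimal sketch of such an audit, not a Census-provided tool. The host name in LEGACY_HOST and the sample URLs are assumptions for illustration; adjust them to match the links actually used on your site.

```python
import re
from urllib.parse import urlparse

# Assumption: legacy deep links point at this host. Change it if yours differ.
LEGACY_HOST = "factfinder.census.gov"

# Simple pattern for quoted href values; enough for an audit, not a full HTML parser.
HREF_PATTERN = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def find_legacy_links(html, legacy_host=LEGACY_HOST):
    """Return every href in the HTML whose host equals legacy_host."""
    links = []
    for url in HREF_PATTERN.findall(html):
        # urlparse lowercases the hostname, so the comparison is case-insensitive.
        host = urlparse(url).hostname or ""
        if host == legacy_host:
            links.append(url)
    return links


if __name__ == "__main__":
    # Hypothetical page fragment; the paths are invented for illustration.
    sample = """
    <a href="http://factfinder.census.gov/servlet/Example?x=1">Old table</a>
    <a href="http://example.com/story">Unrelated link</a>
    """
    for link in find_legacy_links(sample):
        print("Needs a new deep link:", link)
```

Each URL this reports still has to be rebuilt manually as a query in the new AFF; the sketch only finds the links, it does not convert them.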

See the guide for building deep links in the new AFF.

Not all datasets in the new AFF will be available for deep linking. According to the Census site, you won’t be able to link to these historical datasets:

  • 1990 Census
  • 2000-2004 American Community Survey
  • 2000-2001 Supplementary Survey
  • 1997 Economic Census
  • 2003 Annual Survey of Manufactures
  • 2003 Nonemployer Statistics

February 03, 2012

FreeDive: new searchable online database tool from KDMC-UCB

Today the Knight Digital Media Center at the University of California-Berkeley released a simple tool to make data searchable on the web…

FreeDive uses the Google Visualization API to generate a widget that can be embedded in a website. To use it, post your data online in a Google Spreadsheet. Then, configure your widget and link it to the spreadsheet.

Within your widget, people can search your dataset, view a table of results, filter those results, and click on column labels to sort the table. Widget users cannot save or print results, however.
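
To make those three widget operations concrete, here is a minimal sketch of searching, filtering, and sorting rows of a small table. It is not FreeDive’s code and does not call the Google Visualization API; the function names and sample rows are hypothetical, chosen only to illustrate the behavior described above.

```python
def search_rows(rows, term):
    """Keep rows in which any cell contains the term (case-insensitive)."""
    needle = term.lower()
    return [
        row for row in rows
        if any(needle in str(value).lower() for value in row.values())
    ]


def filter_rows(rows, column, value):
    """Keep rows whose value in the given column equals value exactly."""
    return [row for row in rows if row.get(column) == value]


def sort_rows(rows, column, descending=False):
    """Return the rows ordered by one column, as when clicking a column label."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)


if __name__ == "__main__":
    # Invented sample data standing in for rows of a spreadsheet.
    rows = [
        {"agency": "Parks Department", "category": "city", "requests": 12},
        {"agency": "County Clerk", "category": "county", "requests": 30},
        {"agency": "City Clerk", "category": "city", "requests": 7},
    ]
    matches = search_rows(rows, "clerk")
    city_only = filter_rows(matches, "category", "city")
    for row in sort_rows(rows, "requests", descending=True):
        print(row["agency"], row["requests"])
    print(len(matches), len(city_only))
```

Like the widget as described, the sketch only reads and displays data; it offers no way to save or print results.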

March 15, 2012

New campaign finance transparency tool focuses on issues

This week MapLight launched a new resource for tracking how political campaign contributions influence legislation: Topic Pages, which focus on specific issue areas such as natural resources or education…

With this tool you can search for and track federal bills by issue area in the U.S. Congress, as well as bills in two states (California and Wisconsin). This resource combines all of MapLight’s data related to these topics, with several subtopics under each heading:

  • Budget and economy
  • Business and labor
  • Culture and social issues
  • Defense, national security and foreign policy
  • Education
  • Government affairs
  • Health and welfare
  • Legal affairs
  • Natural resources
  • Science, technology, communications and infrastructure

Each topic page highlights:

  • MapLight’s latest research findings by issue area
  • Industry groups supporting or opposing issue-specific legislation
  • Legislators who receive campaign contributions from interests associated with the issue area
  • Industry-specific bills with campaign contributions given by supporting and opposing interests

MapLight also offers an embeddable widget for each topic page, which will display on your site the latest activity surrounding the issue of your choice.

Campaign contribution data comes from the Center for Responsive Politics. Legislative data from
