When Darren Samuelsohn heard "global climate change" during January's State of the Union address, he suspected it was the first time the president had uttered the phrase in his annual assessment of the country.
The Greenwire senior reporter verified his hunch by combing through the six others. And his story was the first to lead with that fact.
"This was a big deal," Samuelsohn said. "While Bush may not have made any major policy reversals on mandatory caps, it put him on record on national TV and before the new Democratic Congress as saying this is a priority for his administration."
It took Samuelsohn about 30 minutes to cut and paste the texts of the past speeches into a Word document and scan them to make sure he was right. But there are easier ways for reporters on deadline to count the incidence of words in the State of the Union or in speeches given by your state environmental department chief, the leader of an environmental group, the mayor, school superintendent, police chief or governor.
It's an analysis that may help you read the tea leaves for shifts in policy or priorities. At a minimum, it provides a fun entry point and fodder for a graphic to spice up a dull speech story.
First, check out http://style.org/stateoftheunion/parse/. It's a nifty parsing tool for counting words in the State of the Union. The comparison of each of Bush's speeches shows an evolution of subjects that are emphasized. Check out words like terror, terrorism, Iraq and war.
You can do the same thing with environment-related words and phrases - energy, ethanol, pollution, nuclear power, global warming. Or contrast words like war and peace or drugs and education.
Most reporters have greater need for analyzing local speeches. Here are two techniques for doing this quickly. One involves a simple spreadsheet. The other uses a speedier Internet-based tool, but you don't get the satisfaction – and the security – of doing it yourself.
The spreadsheet technique: Paste the text into Microsoft Word. Go to "edit/clear/formats" to get rid of formatting.
Call up the search and replace function (control f on PCs; open-apple f on Macs) and replace each punctuation mark with nothing by leaving the "Replace With" box empty.
Replace spaces (hit the spacebar once) with paragraph marks (^p). That puts each word on a separate line.
Paste into a Microsoft Excel spreadsheet under a column labeled Words.
Run a pivot table to count the words. Sort by descending order. Here's how:
Highlight the column including the header and go to Data/PivotTable and PivotChart Report.
Click the "next" button in the first wizard window. Click "next" in the next dialogue box. Click the "layout" button.
Drag the "words" button into the row area of the chart. Again drag the "words" button but this time drop it into the data area. It will change to "count of words."
Click OK and finish. To put the word incidence in order, double click on the gray box behind the word column header. Click on advanced. Under "AutoSort options," check descending. Under "Using field," click on the drop-down arrow to sort by "count of words." Click OK and OK again. The most frequent words appear at the top.
Ignore words like the, and, or, it, they, he, she and others that are not so interesting.
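For reporters comfortable with a little scripting, the same count can be sketched in a few lines of Python. This is only an illustration, not part of the spreadsheet technique above: the function name count_words, the short stop-word list and the sample sentence are all made up for the example. Unlike the Word steps, it keeps apostrophes so that contractions stay intact.

```python
import re
from collections import Counter

# Hypothetical list of uninteresting words; extend it to suit the speech.
STOP_WORDS = {"the", "and", "or", "it", "they", "he", "she",
              "a", "an", "of", "to", "in", "is", "that"}

def count_words(text, stop_words=STOP_WORDS):
    """Return a Counter of lowercase words, skipping the stop words."""
    # Lowercase the text and pull out runs of letters and apostrophes,
    # which drops punctuation and puts each word on its own, like the
    # search-and-replace steps in Word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)

# Made-up sample text standing in for a pasted speech.
sample = "The energy plan is an energy plan, and the plan covers ethanol."
counts = count_words(sample)

# Most frequent words first, like the descending sort in the pivot table.
for word, n in counts.most_common(3):
    print(word, n)
```

To use it on a real speech, replace the sample string with the pasted text or read it from a file.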
For an automated process, go to www.georgetown.edu/faculty/ballc/webtools/web_freqs.html. Paste text into this tool developed by Georgetown University and it will arrange word incidence alphabetically or by frequency.
If you just want the incidence of a particular phrase, you can always search for it in Word and replace it with something else. A dialogue box tells you how often the substitution was made.
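A phrase count like this can also be scripted, which helps when comparing several speeches at once. The sketch below is a rough illustration only: count_phrase is a hypothetical helper, and the two short "speeches" are invented text, not quotations.

```python
import re

def count_phrase(text, phrase):
    """Count case-insensitive, whole-word occurrences of a phrase."""
    # Allow any run of whitespace between the words of the phrase,
    # so a line break in the pasted text does not hide a match.
    body = r"\s+".join(re.escape(w) for w in phrase.split())
    pattern = r"\b" + body + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Invented sample text; paste real speech texts here instead.
speeches = {
    "speech A": "Global climate change is a challenge. "
                "We will confront global climate change.",
    "speech B": "Our economy is growing and the climate "
                "for business is strong.",
}

for label, text in speeches.items():
    print(label, count_phrase(text, "global climate change"))
```

Matching whole words keeps a search for "war" from also counting "warning" or "toward."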
There is a legitimate argument over whether how often something is mentioned represents the priorities of the speaker. It might be an objective measure. But you'll need your reporter's brain to provide context.
Word counts lend themselves well to graphics. The New York Times used circles of varying size and divided them into categories – domestic affairs, taxes and the economy, terrorism and foreign affairs – to depict word frequency in the 2007 State of the Union. In 2004, the Times used similar circles to depict the incidence of 20,000 words spoken by politicians at both party conventions.
"It doesn't take a rocket scientist to look at one of these circle charts and figure out what a politician's priorities are by the words they use," said Karl Gude, the former information graphics editor at Newsweek who now teaches at Michigan State University. "And that's just what I love about them. They convert a daunting amount of data into a simple and instant read."
If nothing else, counting words is a lot more interesting than the old staple of counting how often a speech is interrupted by applause.
David Poulson teaches environmental journalism and computer-assisted reporting at Michigan State University's Knight Center for Environmental Journalism.
** From SEJ's quarterly newsletter SEJournal Spring, 2007 issue.