Craft Sentences
- Understand the principles of crafting good sentences about data
- Distinguish formal and informal writing styles
- Practice revising sentences for clarity and style
For more information about the topics covered in this chapter, refer to the resources below:
- Communicating with Data book (html) by Nolan and Stoudt
Crafting Sentences
Text and examples taken from Communicating with Data by Nolan and Stoudt.
How can we strengthen our writing about data? Let’s talk about some general principles for crafting sentences.
Straightforward Sentences
Scientific writing aims to be precise and concise.
- Remove empty phrases that contain no information, eg, “of course”, “clearly demonstrate”, “it is obvious”, “is well known”, and “it should be pointed out that”
- Trim fat phrases, eg, “has the ability to”, “in the event that”, and “as to whether”
- Use active verbs instead of passive verbs, eg, change “research program has the aim to develop” to “research will pursue”
- Reduce strings of modifiers since too many adjectives and adverbs can make a sentence hard to follow
- Avoid cliches and colloquialisms, eg, “the ball is in your court” and “the elephant in the room”
- Vary sentence structure and length
- Straighten out convoluted sentences by avoiding too many prepositional phrases
Example 1
Notice the difference between the original sentence and the revision. Why is the revision better?
Original: “In this part of our analysis, we assume that flight delays that last shorter than 15 minutes have minimal effects on passengers, and so we reduce our large dataset into a smaller subset in which all departure delays are at least fifteen minutes long.”
Revision: “Since short departure delays have minimal impact on travelers, we analyzed only those flights where the delay was longer than 15 minutes.”
Word Choice
Use nouns as stepping stones in a sentence and active verbs to help you move from one noun to the next.
- Use concrete nouns and avoid imprecise pronouns “it” or “this”
- Use strong verbs and avoid weak verbs such as “is” or “have”
- Use active verbs. If you are afraid your sentence is in the passive voice, add the phrase “by zombies.” If the sentence still makes sense, it is in the passive voice. (Source: https://waldenwritingcenter.blogspot.com/2014/06/how-zombies-can-help-you-avoid-passive.html)
- Match word connotation with context, eg, avoid words with negative connotations in a context that is positive or neutral
- Be careful when swapping with synonyms–not all synonyms are interchangeable
- Avoid overly complex words, eg, words ending in ‘ize’, ‘ization’, or ‘ability’
- Remove redundant words, eg, “mixing together”, “already existing”, and “introduced a new”
Example 2
Notice the difference between the original sentence and the revision. Why is the revision better?
Original: “Thanks to my model’s output I was able to determine that there is a significant relationship between a mother’s smoking behavior and her baby’s weight.”
Revision: “The model output showed a significant relationship between a mother’s smoking behavior and her baby’s weight.”
Grammar Details
Grammatical details can make a big difference in the clarity of your writing. So, check for:
- Subject/verb agreement
- Complete comparisons, eg, “better than any other” should be “better than any other model”
- Parallel structure, eg, “to analyze, to interpret, and to present”
- Lack of sentence fragments, eg, “The statistical test may be used. But only when the data is normally distributed.” should be “A statistical test may be used when the data is normally distributed”
- Consistent verb tense / correct part of speech, eg, “Some researchers feel badly about the treatment of lab animals” should be “Some researchers feel bad about the treatment of lab animals”
- Correct use of prepositions, eg, “The data was collected off the participants” should be “The data was collected from the participants”
- Avoid run-on sentences
The free version of Grammarly.com is helpful in catching many grammatical mistakes.
Example 3
As a group, try to write a more straightforward version of one of the following sentences:
Original 1: After understanding interaction between peers, I am interested in investigating the impact of video games and televisions on math scores; whether spending more time on entertainment would help students perform better in math.
Original 2: According to all the findings regarding seasonal effects on delay, it is reasonable to recommend passengers to reduce their times to take a flight in the summer and winter season.
Original 3: The data are scraped from San Francisco Chronicle weekly. However this feature is provided by the California Resource, a title abstracting company. And all the detailed information are collected from SF Bay Area Counties house transactions.
Writing with Numbers
The following is adapted from Numbers in the Newsroom by Sarah Cohen.
Keep the number of digits in a paragraph below 8
- Before: The Office of Redundancy’s budget rose 48 percent in 2013, from $700.3 million to $1.03 billion.
- Revision: Over the past year, the Office of Redundancy’s budget grew by nearly half, to $1 billion.
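One rough way to check a draft against the digit guideline is to count the digit characters in each version. The sketch below is only an illustration: the helper name `count_digits` is hypothetical, and it counts every character 0 through 9, which is one reasonable reading of "number of digits."

```python
def count_digits(text: str) -> int:
    """Count every digit character (0-9) in a piece of text."""
    return sum(ch in "0123456789" for ch in text)


before = ("The Office of Redundancy's budget rose 48 percent in 2013, "
          "from $700.3 million to $1.03 billion.")
revision = ("Over the past year, the Office of Redundancy's budget grew "
            "by nearly half, to $1 billion.")

print(count_digits(before))    # 13 digits: over the limit of 8
print(count_digits(revision))  # 1 digit: well under the limit
```

The count is a quick screen, not a substitute for judgment: a sentence can pass the check and still bury the reader in numbers written out as words.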
Round a lot
- Only use precision when it matters, eg, never round when it comes to death counts–every body matters.
Think in ratios
We can’t think clearly about very big or very small numbers. Make the numbers you deal with understandable by contextualizing them.
Example: A widely quoted estimate of the cost of “fixing the Year 2000 computer bug” was $50 billion for U.S. companies. How big is $50 billion? At the time,
- It was smaller than Bill Gates’ net worth
- It was the cost of two hurricanes
- It was the income of people living in the Portland, Ore., area.
Example: The nonpartisan Congressional Budget Office estimates that Trump’s 2025 domestic policy bill will add at least $3.3 trillion to the national debt over the next decade. How much is $3.3 trillion? Adapted from the NYTimes’s July 1, 2025 newsletter, here are six ways to think about this number:
- Say you start playing the lottery the day you’re born. If you somehow managed to win every game in every U.S. state, every single day, it would still take you around 75 years to rack up $3.3 trillion.
- It’s enough to buy every piece of real estate in Manhattan twice.
- It’s more than the combined wealth of Musk, Larry Ellison, Mark Zuckerberg, Jeff Bezos and the next 18 richest people in the world.
- If distributed evenly, it would be enough to give every U.S. household more than $25,000.
- Broken into $100 bills, it would create a stack 2,200 miles high — far beyond the orbit of the International Space Station. Laid end to end, those bills could wrap around the Equator 128 times.
- If you spent $1 every second without stopping, it would last more than 104,000 years. If you spent $1 million every day, it would last more than 9,000 years.
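Comparisons like the spending rates in the last bullet are simple ratios, so it is worth checking them before quoting them. The sketch below reproduces only the two spending-rate figures, since they need no outside data. The function name `years_to_spend` is hypothetical, and the 365.25-day year is an assumption.

```python
TOTAL_DOLLARS = 3.3e12  # $3.3 trillion, the estimate quoted above

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25  # assumption: average year length, including leap years


def years_to_spend(total: float, dollars_per_day: float) -> float:
    """Years needed to spend `total` at a constant daily rate."""
    return total / dollars_per_day / DAYS_PER_YEAR


# Spending $1 every second is the same as spending $86,400 every day.
years_at_dollar_per_second = years_to_spend(TOTAL_DOLLARS, SECONDS_PER_DAY)
years_at_million_per_day = years_to_spend(TOTAL_DOLLARS, 1_000_000)

print(f"$1 per second: about {years_at_dollar_per_second:,.0f} years")
print(f"$1 million per day: about {years_at_million_per_day:,.0f} years")
```

Both results agree with the rounded figures in the list: a little over 104,000 years and a little over 9,000 years.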
Use devices from everyday life
Most people have some arithmetic that they perform instinctively, eg, discounts in retail, tipping at a restaurant, 2 to 1 odds. Convert your writing into these commonly used scales. Keep in mind that a percent change, which implies multiplicative change, is very different from a percentage point difference, which implies additive change. Both are correct but emphasize different things, as shown in the examples below:
Population growth has slowed by about 1.3 percentage points since its peak in 1950, to 0.7 percent. (Simple difference between 2 percent and 0.7 percent, expressed in percentage points.)
Population growth in the U.S. slowed by almost two-thirds from its peak in 1950, to 0.7 percent in 2013. (Percent difference between 2 percent and 0.7 percent)
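The two population statements come from the same pair of numbers, a growth rate of 2 percent at the peak and 0.7 percent later. A minimal sketch of the arithmetic, with hypothetical function names chosen for illustration:

```python
def percentage_point_difference(old_rate: float, new_rate: float) -> float:
    """Additive change between two rates that are already in percent."""
    return new_rate - old_rate


def percent_change(old_rate: float, new_rate: float) -> float:
    """Multiplicative (relative) change, expressed as a percent of the old rate."""
    return (new_rate - old_rate) / old_rate * 100


peak_rate, recent_rate = 2.0, 0.7  # growth rates in percent, from the examples above

point_diff = percentage_point_difference(peak_rate, recent_rate)
relative_change = percent_change(peak_rate, recent_rate)

print(f"{point_diff:.1f} percentage points")  # -1.3 percentage points
print(f"{relative_change:.0f} percent")       # -65 percent
```

A drop of 1.3 percentage points and a drop of 65 percent ("almost two-thirds") describe the same change. Which one to report depends on whether the absolute size of the change or its size relative to the starting point matters more to the reader.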
Writing Style
Good scientific writing explains concepts clearly and shares information with a broad audience. For most writing about data, you aim to describe what you found out about the data and the larger context.
One of the most important things to consider in writing is the audience and goal of the communication.
- Audience: What can you assume they know about the data context? What can you assume they know about statistical methods? What data visualizations can you assume they are familiar with? What are their goals in reading your writing?
- Goal: What do you want your audience to do with the information you provide? What do you want them to take away?
Describing Findings
Depending on the audience and goal, you may want to describe your findings in different ways.
- Path you Traveled: Describe the steps you took to get to your findings. This is useful when you want to show the process of data analysis and document each step. This is typically only relevant if the audience is a fellow data scientist or statistician. If you are the main audience for the writing, this is a good way to keep track of your own work.
- What you Found: Describe enough of your process to get to your findings. The goal is for a reader to understand how to reproduce your work; commented code makes the process fully reproducible. This means you won’t necessarily describe the process in chronological order, and some tangential exploration may not be included. This approach is useful when you want to show the results of your data analysis. It is typically relevant if the audience is broad or has more expertise in the data context, and it is a good way to communicate the results of your work to others.
Informal v. Formal Style
Depending on the audience and goal, you may want to write in different styles.
- Informal: This style is more conversational and is often used in blog posts, emails, and other informal writing. It is often more engaging and can be more fun to read. The style allows the writer’s personality to come through. You can often break some of the principles of crafting sentences in this style:
  - Fat Phrases: You can use more fat phrases and colloquialisms because they give the reader insight into the writer’s thought process.
  - Path you Traveled: You are more likely to describe the path you traveled in this style.
  - First Person: You can use first person (using “I” and “me” in the writing). You as the writer can insert yourself into the story.
  - Conversational: You can use contractions and more colloquial language.
- Formal: This style is more precise and clear. It is often used in academic writing such as journal articles, technical reports, and other formal writing. It is often less “fun” to read, but it is more precise. You should follow the principles of crafting sentences in this style:
  - Trim Fat Phrases: You should remove fat phrases and empty phrases.
  - What you Found: You are more likely to describe what you found in this style.
  - Third Person: You should avoid first person (using “I” and “me” in the writing). The writer should not insert themselves into the story.
  - Precise: You should avoid contractions and colloquial language.
Exercises
Exercise 1: Change style
Read the following blog post. Then, with your group, identify phrases you would remove if you wanted a more formal writing style.