I read Nikos Kazantzakis’ Zorba the Greek when I was a sophomore in college, and many of the book’s moments are with me still.
I remember an old man reprimanding the narrator, also known as the bookworm, when the narrator asks which dish is his favorite, telling him it is a great sin to say this dish is good and this dish is bad because there are people in the world who are hungry.
I think about the description of Zorba reaching out his huge hand and closing his mistress Boubalina’s eyes with “indescribable tenderness” after she died.
I remember Zorba’s seizing of life at every possible instant, his not taking offense when Boubalina’s parrot calls him by a different name, and, of course, his love of dance.
[youtube=http://www.youtube.com/watch?v=a6K7OC-IKnA&w=420&h=315]
Yet one of my strongest memories of the book is of the moment when Zorba comes across an old man who is planting an almond tree. When Zorba expresses skepticism that the man will ever live to see a single almond, the old man replies that he acts as if he will live forever, a statement that elicits Zorba’s retort that he lives every day as if it is his last.
“Two equally steep and bold paths may lead to the same peak,” Kazantzakis writes.
I thought of the Greek legend’s words on Thursday, when my friend Macarena Rodriguez, a lawyer and professor, and cognitive science doctoral student Alvaro Graves came and presented to the students in my Data Journalism class at the University of Diego Portales.
We're just two classes into the semester at the University of Diego Portales, and I can already tell we're going to have a lot of fun.
Now, I will be honest and say that I’m not exactly sure how many students are in the class at this point.
The class list I received from the department says I've got 16.
Six students attended the first session, two others wrote me explaining why they won't be there for the first two weeks, and eight students, including four who weren't there the first time, went to the second class.
By my reckoning, that makes 12, and I won't know for sure until August 14.
That's the date when the students have to make their final decisions about what they actually are taking for the semester.
Whatever the total we ultimately will have, I can tell we’re in for some lively exchanges and some learning from each other.
In the first class I explained that working with data entails acquiring, cleaning and analyzing them, incorporating them into your reporting, and displaying them.
There are four major ways to acquire data: writing a freedom of information request; scraping data from websites by writing code and transferring them into a format that can easily be analyzed; downloading existing data; and building a dataset.
Macarena and Alvaro came to talk about the first two options.
Maca spoke first, explaining to the students the origin and key elements of the country’s landmark 2009 transparency legislation.
“There’s no greater disinfectant than sunlight,” one of the slides said.
Macarena proceeded to explain why.
She put Chile’s law in the context of the move by governments around the world over the last 62 years to institute similar legislation. Finland and Sweden were first in 1951, and the United States followed in 1966. Maca also showed a 2011 world map that indicated, country by country, the status of national transparency laws. (Northern and Central Africa and parts of the Middle East and Asia had the biggest holes.)
Although 11 Latin American nations have freedom of information legislation, she talked the students through the history of secrecy that has shrouded many of the countries before going on to talk about key features of the Chilean law like the transparency council that decides on individual requests.
The students peppered her with questions about the council’s composition and the types of records that are subject to the law.
Although the volume of questions meant we did not have time to see the sample of a successful information request that Maca had brought, she has agreed to look at the students’ letters to help refine them and make them as precise as possible.
Precision is a critical part of scraping, and Alvaro talked the students through what he and other members of the winning team did in a recent Scrapeathon here in Santiago. (For those who don’t know, a scrapeathon is when teams compete in a specific amount of time to pull data from a publicly available site, organize them into an analyzable file and then build some sort of visualization from them.)
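For readers who have never seen a scraper, here is a minimal sketch in Python of the "pull data, organize them into an analyzable file" step. It is not the team's code. The page fragment, the school names and the scores are all made up for illustration, and a real scraper would first download the page from a public site.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page fragment (invented names and scores), standing in for
# a page that a real scraper would download from a public website.
SAMPLE_HTML = """
<table>
  <tr><th>school</th><th>score</th></tr>
  <tr><td>Escuela A</td><td>251</td></tr>
  <tr><td>Escuela B</td><td>287</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished rows, each a list of cell strings
        self._row = None      # the row currently being read, if any
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Return the table in `html` as a list of rows."""
    parser = TableScraper()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Turn the rows into CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = scrape_table(SAMPLE_HTML)
csv_text = to_csv(rows)
print(csv_text)
```

The sketch uses only Python's standard library so that it runs anywhere. Real pages are messier, which is where the precision comes in.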
Alvaro and his team were interested in looking at school quality in Santiago.
They used the SIMCE, a single number published by the Chilean government that ranges from 200 at the lowest to 300 at the highest.
After pulling the data, the team then merged that information with geographic location and plotted the points on a map using a free tool from Google.
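The merge step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's actual code: the school identifiers, names, scores and coordinates below are invented, and the function name is hypothetical.

```python
# Hypothetical scraped scores, one record per school.
scores = [
    {"school_id": "s1", "name": "Escuela A", "score": 251},
    {"school_id": "s2", "name": "Escuela B", "score": 287},
    {"school_id": "s3", "name": "Escuela C", "score": 295},
]

# Hypothetical coordinates (latitude, longitude), keyed by the same identifier.
# "s3" is left out on purpose to show what happens to an unmatched record.
locations = {
    "s1": (-33.45, -70.66),
    "s2": (-33.41, -70.57),
}


def merge_scores_with_locations(scores, locations):
    """Attach coordinates to each score record that has a known location.

    Returns the merged records plus the ids that could not be matched,
    so that gaps in the data stay visible instead of silently vanishing.
    """
    merged = []
    unmatched = []
    for record in scores:
        coords = locations.get(record["school_id"])
        if coords is None:
            unmatched.append(record["school_id"])
            continue
        lat, lon = coords
        merged.append({**record, "lat": lat, "lon": lon})
    return merged, unmatched


merged, unmatched = merge_scores_with_locations(scores, locations)
print(merged)
print("No location found for:", unmatched)
```

Each merged record now carries a score and a point that a mapping tool can plot. Keeping a list of the unmatched schools is a design choice worth copying, because a map that quietly drops schools can mislead.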
That was just the first phase for the team.
They then moved to show the amount of distance students would have to travel and money parents would have to pay by neighborhood to go to schools of varying quality levels.
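One way to estimate travel distance is the haversine formula, which gives the straight-line distance over the earth's surface between two points. The sketch below assumes that approach; I don't know which method the team used. The schools, scores, home location and function names are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius of the earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_school(home, schools, min_score):
    """Return (name, distance_km) of the closest school scoring at least
    `min_score`, or None if no school reaches that score."""
    candidates = [s for s in schools if s["score"] >= min_score]
    if not candidates:
        return None

    def distance(school):
        return haversine_km(home[0], home[1], school["lat"], school["lon"])

    best = min(candidates, key=distance)
    return best["name"], distance(best)


# Invented schools and an invented home location, for illustration only.
schools = [
    {"name": "Escuela A", "score": 251, "lat": -33.45, "lon": -70.66},
    {"name": "Escuela B", "score": 287, "lat": -33.41, "lon": -70.57},
    {"name": "Escuela C", "score": 295, "lat": -33.60, "lon": -70.58},
]
home = (-33.46, -70.65)

print(nearest_school(home, schools, min_score=250))
print(nearest_school(home, schools, min_score=280))
```

Raising the minimum score shrinks the list of candidate schools, so the nearest acceptable school can only stay where it is or move farther away. Straight-line distance also understates real travel, since streets and bus routes rarely run in straight lines.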
The point, unsurprisingly, was that parents in poor neighborhoods would have to pay more and send their children farther to attend high-quality schools than their wealthier counterparts would.
Again, the students lobbed a series of probing questions at Alvaro.
How did you know where in the neighborhood people live, one student wanted to know.
Alvaro explained that he and the team had scraped the data, joined them and built the site in eight hours, adding that the code they used was open source and available on their GitHub repository.
The team plans to refine the project, he said.
Time was running very short.
I reminded the students that while we were going to hear from many American journalists during the course, we were starting with Chilean professionals who had studied in Chile and the United States, who were available to them as resources and who are, in different ways, committed to bringing the truth about their society to light.
I also repeated that the students’ assignment was to write a 500-word analysis of the advantages and disadvantages of each method of data acquisition.
One of the students, who had taken notes for the class, said his notes ran about 500 words and asked if he could be exempt from the essay.
No.
I took a few pictures of the speakers and students.
Maca zipped out of the door and onto her next task. Alvaro lingered for a while.
Several students asked again to clarify the homework.
Writing freedom of information requests and scraping data may not be the stuff of life and death that Kazantzakis wrote about in his epic novel, but they are different paths to reach the same goal.
On Tuesday we'll see where the students land.
We'll tally their arguments into a list in a Google Spreadsheet, thereby showing them how to build a database.
I can’t wait.