Table Data

< Code In The Browser

As another example of how data is stored and manipulated in the computer, we'll look at "structured" data ... strings, numbers, dates organized in the computer. In particular, we'll start with data from the social security administration baby name site.

The baby name data has the extremely common "table" structure:

Much of the information stored on computers uses this table structure. One "thing" we want to store -- a baby name, someone's contact info, a craigslist advertisement -- is one row. The number of fields that make up a row is fairly small -- essentially the fixed categories of information we think up for that sort of thing. For example one craigslist advertisement (stored in one row) has a few fields: a short description, a long description, a price, a seller, ... plus a few more fields.

The number of fields is small, but the number of rows can be quite large -- thousands or millions. When someone talks about a "database" on the computer, that builds on this basic idea of a table. Also storing data in a spreadsheet typically uses exactly this table structure.

Table Code

We'll start with some code -- SimpleTable -- which will serve as a foundation for you to write table code. Run the code to see what it does.


Table Query Logic

The interesting thing to do is write some "query" logic where we pick out just the rows we care about.

The row object has a row.getField("field-name") function which returns the data for one field out of the row. Each field has a name -- one of "name" "rank" "gender" "year" in this case -- and the string name of the field is passed in to getField() to indicate which field we want, e.g. row.getField("rank") to retrieve the rank out of that row.

You can test if two values are equal in JavaScript with two equal signs joined like this: ==. Using ==, the code to test if the name field is "Alice" is row.getField("name") == "Alice"

We can write an if-statement within the for-loop to test for certain rows. For each row, this code tests if the name == "Alice", pulling out and printing that one row:


Note that a single equal sign = does assignment to a variable and not comparison. It's a common mistake to type in one equal sign for a test, when you mean two equal signs. Unfortunately, JavaScript does not flag this error, so you have to look at your code and notice it. The regular less-than/greater-than type tests: < > <= >= work as have seen before.

We'll try these code variations on the above code example

Solution code:

If logic inside the loop:

if (row.getField("name") == "Alice") {
  print(row);
}

if (row.getField("rank") == 1) {
  print(row);
}

if (row.getField("rank") < 20) {
  print(row);
}

if (row.getField("gender") == "girl") {
  print(row);
}

s.startsWith("hi") s.endsWith("y")

For our purposes, strings support a s.startsWith("Ab") function, here testing if the string in the variable s starts with the "Ab" .. true or false. Likewise, there is s.endsWith("yz"), here testing if the string in variable s has "yz" at its very end. (Sadly, these two functions are not part of standard JavaScript; I made them work just for out code because they are so useful. These two functions are common in other computer languages.)

These tests work very well with the name strings pulled out of the baby data. Here we can look at all the names beginning with "Ab".


Solution code:

If logic inside the loop:

if (row.getField("name").startsWith("Ab")) {
  print(row);
}

if (row.getField("name").endsWith("z")) {
  print(row);
}

Boolean Logic: and &&, or ||, not !

A && means "and" (two ampersands). A || means "or" (two vertical bars, possibly near the return key on your keyboard). A ! (exclamation mark) means "not". So for example the test
(x.startsWith("A") && x.endsWith("z")) is true only if x starts with A and ends with z. The test
(x.startsWith("A") || x.endsWith("z")) is true if x starts with A or x ends with z (or both). Sorry the && || ! syntax is a bit cryptic -- it's a historical syntax used in many languages.

Here's the if-test for the query: popular names beginning with A (i.e. name starts with A and rank <= 50):

  if (row.getField("name").startsWith("A") &&
      row.getField("rank") <= 50) { ...

Now add the constraint that they must be girl names:

  if (row.getField("name").startsWith("A") &&
      row.getField("rank") <= 50 &&
      row.getField("gender") == "girl") { ...

Since the test code is rather long, it's broken up to go on multiple lines. Javascript allows you to spread a test across lines like that. Notice that right after the word "if", the whole test is surrounded by one pair of parenthesis ( ... ) -- the parenthesis are required at all times JavaScript if-tests; forgetting them is a common mistake.

Below is the working code for the Popular A names test mentioned above.


Experiments to try:

Solution code:

If logic inside the loop:

if (row.getField("name").startsWith("Ab") ||
    row.getField("name").startsWith("Ac")) {
  print(row);
}

if (row.getField("name").startsWith("A") ||
    row.getField("name").startsWith("Ac") ||
    row.getField("name").startsWith("Al")) {
  print(row);
}

if (row.getField("name").startsWith("O") &&
    row.getField("name").endsWith("a")) {
  print(row);
}

if (row.getField("name").startsWith("O") &&
    row.getField("gender") == "girl") {
  print(row);
}

if (row.getField("rank") <= 10 &&
    row.getField("gender") == "girl") {
  print(row);
}

if (row.getField("name").endsWith("ia") &&
    row.getField("gender") == "boy") {
  print(row);
}

Table Query Exercises

1. Top 10

Add code to print the rows where the rank is <= 10.

table1-ex1


2. A Names

Add code to print the rows where the name starts with "A" and the rank is less than 40.

table1-ex2


3. Z Boy

Add code to print the rows where the name ends in "z" and the gender is "boy".

table1-ex3


4. A Girl (optional)

Add code to print the rows where the name ends with "a" and the gender is "girl" and the rank is less than 100. Then try it for "boy".

table1-ex4


Counting

Thus far we've used an if/print structure inside a loop to select certain rows to print or not. Now we'll use a variable with a little code to count how many times an if-test is true. Below is a loop as we have seen before that detects all the names that start with "A". After printing the rows, the code prints a summary of how many names started with "A".

The count version of the code has three additions to the previous printing version of the code:

  1. count = 0; -- Before the loop, create a count variable and set it to 0.
  2. count = count + 1 -- In the loop, this increases whatever number is stored in count by 1 when the if statement is true. This is a weird use of the equal sign = vs. how it works it mathematics. First it evaluates the right hand side. Then it assigns that value back into the count variable, so in effect it increases the number stored in count by 1. Placing a ++ after a variable is a shorthand for this.
  3. print("count:", count); -- finally, outside the loop we need some code to print the value stored in the count variable.


Counting Multiple Things

Do more boy or girl names begin with A? We want a program which by the end prints "girl count:nn" and "boy count:nn", counting whatever characteristic we are looking for. One approach is to copy/paste the loop, with the first loop counting for girls, and then the second loop counting for boys. This approach works, although it's a little wordy.


Another approach is to use two counters, one for the girl case and one for the boy case: countG and countB. Initialize both counters to 0 outside the loop. Then in the loop, have two if statments, one to manage each counter. When the loop is done, just print both counters. (solution below)

eg13-3


Class Survey

survey

As another example of a table, we have the class survey data with people's favorite movies and what have you. The survey data appears in a google spreadsheet which can be exported in csv format. This data is available in the file "lead-survey-2011.csv". This also illustrates that .csv format's role as an interchange format between systems.

The fields in this data are: Timestamp (Google surveys put this in automatically), color, milk, tv, movie, sport, book, soda. I cleaned up the data a little, regularizing the spelling, removing most punctuation, and paring down multiple answers. The convertToLowerCase() function of the table, changes all the text it contains to lower case. This simplifies our logic, so we don't have to worry if someone typed in "Blue" or "blue" .. in the table it will always be the lowercase "blue" form.

Write code to print and count the rows of people with favorite color "red". Then re-use the code to print for green blue, and black.


This work was created by Nick Parlante and is released under the Creative Commons Share-Alike license 3.0