Friday, March 08, 2019

Making Projects More Interesting With More Data

My students are working on the palindrome project. You probably know it – enter a string and report if it is a palindrome or not. In my case, I have them create a method to “clean” the string by stripping out all of the characters that are not letters and a second method to reverse the string. My goal is practice in creating methods as well as string handling and loops.
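For readers who want something concrete, here is a minimal sketch of the two methods in Python. The language and the names clean, reverse, and is_palindrome are picked only for illustration. Lowercasing inside clean is an added assumption, so that "Race Car" matches its own reverse.

```python
def clean(text):
    """Keep only the letters. Lowercasing is an assumption added here
    so that upper and lower case compare as equal."""
    return "".join(ch.lower() for ch in text if ch.isalpha())


def reverse(text):
    """Build the reversed string one character at a time with a loop."""
    result = ""
    for ch in text:
        result = ch + result  # put each new character in front
    return result


def is_palindrome(text):
    """A phrase is a palindrome if its cleaned form reads the same reversed."""
    cleaned = clean(text)
    return cleaned == reverse(cleaned)
```

Python can reverse a string with a slice (text[::-1]), but writing the loop by hand keeps the loop practice that is the point of the exercise.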

It’s always more interesting if the test data is more interesting. Students already know plenty of common palindromes, such as “Madam I’m Adam” and “Race Car” and the ever popular “Mom”, “Dad”, and “Bob.” My favorite is “A man, a Plan, a Canal – Panama” because you can read it dramatically. This week I stumbled on a web site dedicated to palindromes: http://www.palindromelist.net/. There are probably other palindrome-related web sites as well.

I’m thinking about supplying a file of lines, some palindromes and some not, and having students modify this first program to read the file and report on each line. More data is better data.
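One way the file version might look, again as a Python sketch. It assumes one phrase per line, skips blank lines, and treats a line with no letters at all as not a palindrome. The function name report is hypothetical.

```python
def is_palindrome(text):
    """Compare the letters only, ignoring case, against their reverse."""
    letters = [ch.lower() for ch in text if ch.isalpha()]
    # A line with no letters at all is treated as "not a palindrome".
    return bool(letters) and letters == letters[::-1]


def report(path):
    """Return a (phrase, is_palindrome) pair for each non-blank line."""
    results = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            phrase = line.strip()
            if phrase:
                results.append((phrase, is_palindrome(phrase)))
    return results
```

Calling report on a file name of your choosing (say, a hypothetical palindromes.txt) gives back a list that students can print however they like.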

Another idea I have percolating is asking students to write code that creates palindromes. I’m not sure how hard this would be as I’m still thinking about how I would do it. Again, data is important. You want to use real words and ideally the phrase should make sense. Does it have to though? Hmm.

I have a dictionary file – a text file with almost 114,000 words in it. I keep thinking it would be useful for a lot of interesting projects: things like word games (Boggle, Scrabble, etc.) or maybe spell checkers.
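As a rough idea of how a dictionary file could drive a very simple spell checker, here is a Python sketch. It assumes the file holds one word per line, which may not match every word list, and the names load_words and misspelled are made up for the example.

```python
def load_words(path):
    """Read a word list (assumed: one word per line) into a set for fast lookup."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def misspelled(sentence, words):
    """Return the words in the sentence that are not in the dictionary set."""
    unknown = []
    for token in sentence.split():
        # Strip punctuation and digits so "cat," is looked up as "cat".
        word = "".join(ch for ch in token if ch.isalpha()).lower()
        if word and word not in words:
            unknown.append(word)
    return unknown
```

A set makes each lookup fast even with over 100,000 words, which is a nice talking point about choosing data structures.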

There are sources of large text files on the internet as well. I have files with the full text of some books (in the public domain of course) and Shakespeare's sonnets. Check out Project Gutenberg, which has some 58,000 public domain books.

I’m thinking some interesting word and letter count projects are a natural. A lot of the projects that have been presented at the SIGCSE Nifty Assignments session (collected at http://nifty.stanford.edu/) involve working with data sets of words or text. You all know about the Nifty Projects resource, right?
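A word and letter count project can be quite short. The Python sketch below uses collections.Counter from the standard library. The rule for what counts as a word (runs of the letters a to z, with apostrophes allowed inside) is my own simplification, not a standard.

```python
import re
from collections import Counter


def letter_counts(text):
    """Count how often each letter appears, ignoring case."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def word_counts(text):
    """Count words; a word here is a run of a-z with optional inner apostrophes."""
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower()))
```

Feeding either function the text of a whole book and asking for most_common(10) on the result gives students real output to talk about.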

You can also make interesting data. For a long time I have assigned a project that creates driver’s licenses based on name and birthday. New Hampshire recently moved away from this scheme for privacy reasons but it was/is a fun project. I grabbed first and last name data from the Census Bureau (there are other lists) and wrote a program to create a data file of random names and birthdays. I’m toying with giving students something like that as an assignment some day. Making sure the dates exist (no February 30th for example) makes it fun. OK, harder, but harder is fun, right?
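Here is a rough Python sketch of a generator for random names and birthdays. It is not the program described above. The short name lists are placeholders standing in for the Census Bureau data, the date range is arbitrary, and all of the function names are hypothetical.

```python
import random
from datetime import date, timedelta

# Placeholder lists; a real run would load names from a file instead.
FIRST_NAMES = ["Ada", "Grace", "Alan", "Linus"]
LAST_NAMES = ["Lovelace", "Hopper", "Turing", "Torvalds"]


def is_real_date(year, month, day):
    """Return True if the date exists (so February 30th fails)."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


def random_birthday(rng, start=date(1950, 1, 1), end=date(2005, 12, 31)):
    """Pick a random day between start and end, so the date always exists."""
    span = (end - start).days
    return start + timedelta(days=rng.randint(0, span))


def make_records(count, seed=None):
    """Build a list of (first name, last name, birthday) tuples."""
    rng = random.Random(seed)
    return [
        (rng.choice(FIRST_NAMES), rng.choice(LAST_NAMES), random_birthday(rng))
        for _ in range(count)
    ]
```

Picking a random offset from a start date sidesteps the February 30th problem entirely, while is_real_date shows the check students would need if they pick the month and day separately.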

What sorts/sources of data do you use to make projects more interesting?

4 comments:

Garth said...

I have a text file of Shakespeare's insults. Three words to a line separated by commas. About 40 lines. Lots of fun to be had with this. I also use the Shakespeare sonnets file. Another useful one is a text file of about 10,000 words, one word per line.

Alfred Thompson said...

I also do the Shakespeare Insults. Kids love it.

Garth said...

Can we consider that cross-curricular?

Nithya said...

Palindromes are always a hit with kids. Number palindromes are easier to develop when compared to words. Input the string or number, reverse it, and compare the two. We have only used word-buff but the link you shared has a huge list, thanks for sharing, it was very useful. Hope your students enjoyed the assignment.