I am working on a web app using GAELYK on Google App Engine and needed to reset a counter on every entity of a specific type every night. This is a perfect job for Cron and the MapReduce API. I coded the mapper servlet and was able to start the job from the MapReduce dashboard. However, I was unable to start the job using Cron. The MapReduce dashboard uses an AJAX POST call to start a job, and starting one programmatically also requires a POST call, but Cron can only invoke an action using GET. This meant I would have to run the job manually from the MapReduce dashboard every day, which is hardly useful.
Luckily, I found a workaround that I hope will be useful to someone else. I modified the mapreduce-appengine library to allow MapReduce jobs to be started with a GET call. The change should work for anybody running Java on App Engine. You can grab the source code to compile it yourself or just download my modified version of appengine-mapper.jar and add it to your project.
In mapreduce.xml the job is defined as normal. In this case I have a Mapper called PeriodResetMapper that performs a task on every Account entity.
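As a rough sketch of what such a definition might look like, here is a mapreduce.xml entry for the periodReset job. The package name com.example.mappers is hypothetical, and the property names follow the appengine-mapreduce configuration format as I understand it, so check them against the version of the library you are using.

```xml
<configurations>
  <!-- "periodReset" is the job name that the start_job URL refers to -->
  <configuration name="periodReset">
    <property>
      <name>mapreduce.map.class</name>
      <!-- Hypothetical package; use the fully qualified name of your own mapper -->
      <value>com.example.mappers.PeriodResetMapper</value>
    </property>
    <property>
      <name>mapreduce.inputformat.class</name>
      <value>com.google.appengine.tools.mapreduce.DatastoreInputFormat</value>
    </property>
    <property>
      <!-- The entity kind to iterate over; can be supplied when the job starts -->
      <name human="Entity Kind to Map Over">mapreduce.mapper.inputformat.datastoreinputformat.entitykind</name>
      <value template="optional">Account</value>
    </property>
  </configuration>
</configurations>
```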
To call this mapper from Cron you must call the /mapreduce/command/start_job URL with some parameters. I have a simple mapper that only needs the name of the Mapper and the type of Entity to iterate over. Replace the mapper name, in this case periodReset, with your own as defined in mapreduce.xml. Also replace the entity name, in this case Account, with the type of entity in your data store that you want to iterate over. Keep in mind that you must escape the ampersands between your URL parameters in your cron.xml as &amp;amp;, because the file is XML. In my cron.xml, the mapper is called every day at midnight.
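A cron.xml along these lines would schedule the job. Treat it as a sketch: the query parameter names (name and the mapper_params. prefix) are my assumption about what the start_job handler expects, so verify them against the modified jar before relying on them. The cronentries structure and the schedule syntax are standard App Engine cron.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <!-- Parameter names are assumed; note the ampersand escaped as &amp; -->
    <url>/mapreduce/command/start_job?name=periodReset&amp;mapper_params.mapreduce.mapper.inputformat.datastoreinputformat.entitykind=Account</url>
    <description>Reset the period counter on every Account entity</description>
    <!-- Runs once a day at midnight (UTC unless a timezone element is added) -->
    <schedule>every day 00:00</schedule>
  </cron>
</cronentries>
```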
I strongly suggest that you set a security constraint in your web.xml file to restrict the MapReduce servlet endpoint to users with admin access (Cron requests count as admin). You can grab all of the files in a Gist here.
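For reference, a security constraint of this kind could look like the following fragment, placed inside the web-app element of web.xml. It assumes the MapReduce servlet is mapped under /mapreduce/*; adjust the URL pattern if your mapping differs.

```xml
<security-constraint>
  <web-resource-collection>
    <web-resource-name>mapreduce</web-resource-name>
    <!-- Assumes the MapReduce servlet is mapped under /mapreduce/ -->
    <url-pattern>/mapreduce/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <!-- On App Engine, "admin" limits access to application administrators and Cron -->
    <role-name>admin</role-name>
  </auth-constraint>
</security-constraint>
```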