(A run utilizing 8320 processors on Amazon. See below for a bash script to do this.)
Starting with the basics. I did say I would start with the trivial, so first a few helpers for readability:
```python
import os

def domino_run_id():
    try:
        return os.environ['DOMINO_RUN_ID']
    except KeyError:
        return None

def running_on_domino():
    return domino_run_id() is not None

def running_on_local():
    return not running_on_domino()
```

Domino environment variables

While we are at it:
```python
current_project = os.environ['DOMINO_PROJECT_NAME']
current_project_owner = os.environ['DOMINO_PROJECT_OWNER']
current_run_id = os.environ['DOMINO_RUN_ID']
current_run_number = os.environ['DOMINO_RUN_NUMBER']
domino_project_path = '{0}/{1}'.format(current_project_owner, current_project)
```
Shelling out
Due to vagaries I don't fully understand, but which are undoubtedly related to security, file permissions might not be what you expect. To work around this you might go so far as to set the permission right before you need it:
subprocess.call(["chmod", "+x", "stockfish.sh"]) subprocess.call(["chmod", "+x", "stockfish_"+binary]) cmd = ' '.join( ['./stockfish.sh' ,fen, str(seconds) , binary, str(threads), str(memory) ] ) print cmd subprocess.call( cmd, shell=True )Now unfortunately if your code is not in the same project as your data (which I recommend, see below) this won't work, at least if run from the data project. One can make a temporary copy of the script you are shelling out to and also call that just before shelling out.
```bash
interface_dir="${project}/lib/bash/interface"
interface_mirror="${project}/data_project/bash_mirror/interface"
mkdir -p "${interface_mirror}"
cp -R "${interface_dir}"/* "${interface_mirror}"
```

Yeah, not the most elegant, but it works; there is probably a better way. One of the Domino engineers suggested adding sh in front of commands (see this note), but whatever you try, be aware that in Domino, file permissions set during a run in one project will not be preserved when you import that project into another.
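For what it's worth, here is what the sh suggestion looks like applied to the earlier stockfish call. This is just a sketch (the variable values below are placeholders I made up), and whether it sidesteps the permission issue in your setup is worth verifying:

```python
import subprocess

# Placeholder values standing in for the real run parameters
fen, seconds, binary, threads, memory = 'startpos', 30, 'linux64', 4, 1024

# Prefixing with "sh" means the execute bit on the script itself no longer matters
cmd = ' '.join(['sh', './stockfish.sh', fen, str(seconds), binary, str(threads), str(memory)])
subprocess.call(cmd, shell=True)
```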
Setting paths in bash
For the times when you don't want to rely on Domino variables explicitly:
```bash
USER=$(whoami)

if [[ $OSTYPE == "darwin"* ]]   # Your machine, not AWS (hopefully :))
then
    # We're local
    default_project="/Users/projects/my_project"   # Hardwired
    default_sz=small
else
    # We're on AWS
    default_project="/mnt/${USER}/my_project"
    default_sz=full
fi

# Allow override from the command line
project=${1:-${default_project}}
sz=${2:-${default_sz}}

# Then do something...
```

Consistent path names

Incidentally, a project importing other projects can use full paths such as
```
/mnt/${USER}/my_project/etc
```

but a project that does not import any other projects cannot. To avoid inconsistency, just make sure every project imports one other project, even if it is a dummy project.
Drop-in multiple endpoint functions
Domino allows only one endpoint function per project. To let the client call any function you care to drop into your endpoint file (by name), include these few lines of code at the top of that file and register dispatcher as the official endpoint function.
```python
import sys

def dispatcher(func, *args):
    """ As Domino allows one endpoint per project """
    module = sys.modules[__name__]
    endpoint_func = getattr(module, func)
    return endpoint_func(*args)
```

The price you pay is specifying the function you really want as the first parameter in any call. This is not recommended for national security applications.
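As a made-up example, if your endpoint file also defines a multiply function, a client request passes its name as the first argument and dispatcher does the rest:

```python
def multiply(x, y):
    return x * y

# With dispatcher registered as the endpoint, a request with
# parameters ["multiply", 3, 4] resolves to multiply(3, 4)
result = dispatcher('multiply', 3, 4)   # 12
```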
Safe sync
Very, very occasionally a failed sync can leave client and server in a state where it is inconvenient or confusing to revert to a previous code version. If you're a nervous Nellie like me, there is a simple way to reduce the chance of any code confusion:
- Copy your source code to a second Domino project (a minimal sketch of a backup/sync script follows below)
- Sync both with two separate Domino syncs
- Chmod your backup/sync script so it is executable
- Rename it with a .app extension so that you can drag it into the Mac dock
- Rename it back to the original .sh extension
- In Finder, right click, choose "Get Info", and under the "Open with" menu select the Terminal application
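Here is a minimal sketch of such a backup/sync script, assuming your projects live under /Users/${USER}/projects, that the Domino CLI is on your path, and that the CLI keeps its bookkeeping in a .domino directory you don't want to clobber:

```bash
#!/usr/bin/env bash
# Mirror the working project into a backup project, then sync both.
# All paths here are assumptions - adjust to your own layout.
USER=$(whoami)
src="/Users/${USER}/projects/my_project"
backup="/Users/${USER}/projects/my_project_backup"

# Copy everything except the CLI's own metadata
rsync -a --exclude '.domino' "${src}/" "${backup}/"

# Sync each project separately
(cd "${src}" && domino sync)
(cd "${backup}" && domino sync)
```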
Recovery
To use the server's state use domino reset. To use the local state use domino restore. More on recovery of larger projects below.
Bash script to run job on Domino and wait for completion
The script call.sh uses the Domino API to start a run and then waits for it to complete.
```bash
#!/usr/bin/env bash
# Run a job on Domino and wait for it to finish
#
# Usage:
#   call.sh /mnt/USER/MYPROJECT/myscript.sh my_arg1 my_arg2

cmd=${1}
arg1=${2}
arg2=${3}

# Send the request to start the job to Domino
temporary_response_file="response_.txt"
curl -X POST \
  https://api.dominodatalab.com/v1/projects/USER/PROJECT/runs \
  -H 'X-Domino-Api-Key: YOUR_API_KEY' \
  -H "Content-Type: application/json" \
  -d '{"command": ["'"${cmd}"'", "'"${arg1}"'", "'"${arg2}"'"], "isDirect": false}' > ${temporary_response_file}
echo "Sent command to start job with arg1 ${arg1} and arg2 ${arg2}."

# Pull the runId out of the JSON response (see the hack further down)
runId_quoted=$(grep -oE '"runId":"(.*)",' ${temporary_response_file} | cut -d: -f2)
runId=${runId_quoted:1:${#runId_quoted}-3}
echo "The runId is ${runId}"
rm ${temporary_response_file}

# Now poll until done
while true; do
    sleep 60s
    echo "Polling ..."
    response=$(curl https://api.dominodatalab.com/v1/projects/USER/PROJECT/runs/$runId \
      -H 'X-Domino-Api-Key: YOUR_API_KEY' | grep -oE '"isCompleted":.*')
    if [[ ${response} == *true* ]]
    then
        echo "Job $runId has finished"
        break
    fi
done
```
Aside: passing bash variables in curl requests with "'"${}"'"
Incidentally, am I the only one who found this slightly troublesome? Here's ten minutes of my life I'm donating to you:
```bash
key=$1
curl -X POST \
  https://api.dominodatalab.com/v1/projects/YOUR_USER/YOUR_PROJECT/runs \
  -H 'X-Domino-Api-Key: YOUR_API_KEY' \
  -H "Content-Type: application/json" \
  -d '{"command": ["something.sh", "'"${key}"'"], "isDirect": false}'
```

Note the double-single-double quoting of the bash variable.
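If it is still not obvious, reduce it to an echo you can run locally and look at what the shell actually hands to curl:

```bash
key="AB"
# Single quotes keep the JSON's own double quotes literal; the short
# double-quoted stretch in the middle lets bash expand ${key}
echo '{"command": ["something.sh", "'"${key}"'"], "isDirect": false}'
# -> {"command": ["something.sh", "AB"], "isDirect": false}
```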
Lazy man's map
Want to use more than one machine at once? I discovered that the easiest way is to have each job write to the main branch. Domino sync will take care of the syncing that way, and you need only filter each run by some key (say {A..Z}).
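Each worker run then looks something like the sketch below; the directory layout, the process_one.sh helper, and the results_${ky}_* naming are all assumptions to illustrate the idea:

```bash
#!/usr/bin/env bash
# YOUR_COMMAND.sh - process only the input files matching this run's key
root=${1}   # e.g. /mnt/YOUR_USER
sz=${2}     # e.g. full or small
ky=${3}     # e.g. A

for f in ${root}/YOUR_PROJECT/data/input/${ky}*
do
    # Do the real work, writing results alongside the project data;
    # Domino's post-run sync merges them back to the main branch
    ./process_one.sh "${f}" "${sz}" > "${root}/YOUR_PROJECT/data/results_${ky}_$(basename ${f}).txt"
done
```

The launcher below then simply starts one such run per key.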
```bash
#!/usr/bin/env bash
# Start multiple jobs on Amazon
USER=$(whoami)
project="/Users/${USER}/project"
sz=${1:-full}   # Just an example of a parameter passed to all jobs

# Prerequisite: a file with one line of space-separated keys which break up
# the jobs (see the ordering hack below)
ordering_file="${project}/config/letter_ordering.txt"
read -a kys <<< $(head -n 1 ${ordering_file})

for ky in ${kys[@]}   # or just {A..Z} if you don't care about ordering
do
    sleep 10s   # Give AWS a little time so jobs get ordered the way you expect
    curl -X POST \
      https://api.dominodatalab.com/v1/projects/YOUR_USER/YOUR_PROJECT/runs \
      -H 'X-Domino-Api-Key: YOUR_API_KEY' \
      -H "Content-Type: application/json" \
      -d '{"command": ["YOUR_COMMAND.sh", "/mnt/YOUR_USER", "'"${sz}"'", "'"${ky}"'"], "isDirect": false}'
    echo "Sent command to start job with source_filter ${ky}"
done
```
Ordering your jobs so large input files go first
The bash script ordering.sh provides an easy way to divide up your input data by letter ({A..Z}, say) and order the jobs by size. Hack as you see fit.
```bash
#!/usr/bin/env bash
# Create a file which sorts data sizes by letter, so we can order the jobs sensibly
data_dir="/whereyouputbigdatainputfiles"
config_dir="/Users/YOUR_USER/project/my_project/config"
tmp_file="data_density.txt"
ordering_file="letter_ordering.txt"

# Delete the old statistics file if it exists
if [[ -e ${config_dir}/${tmp_file} ]]
then
    rm ${config_dir}/${tmp_file}
fi

# Create a file with one line per letter:
#   Size     key
#   1223411  A
#   1231233  B
for x in {A..Z}
do
    data_sz=$(du -c ${data_dir}/${x}* | awk '/./{line=$0} END{print $1}')
    echo "${data_sz} ${x}" >> ${config_dir}/${tmp_file}
done

# Sort, extract the letters, and convert to a single row
sort -r -t " " -k 1 -g ${config_dir}/${tmp_file} | awk '{print $2}' | tr '\n' ' ' > ${config_dir}/${ordering_file}
rm ${config_dir}/${tmp_file}

# Now you can easily read the ordered keys into a bash array:
#   read -a ordering <<< $(head -n 1 ${config_dir}/${ordering_file})
```

Adjust as you see fit. I use this in conjunction with the previous hack to reduce the wall clock time of big jobs.
Lazy man's map-reduce
You'll often want to wait for the results to come back so you can get on with dependent tasks. I've posted a polling version of the task launcher as well. To poll directly from bash you can do something like this:
finished_runs="" while true; do sleep 1m status="finished" for runId in ${runIds[@]} do if [[ ${finished_runs} == *"$runId"* ]] then : # No need to check again after it has finished else response=$(curl https://api.dominodatalab.com/v1/projects/YOUR_USER/YOUR_PROJECT/runs/$runId \ -H 'X-Domino-Api-Key: 7DunjmIbhopeWKTrumVpgDieszAFterrWXibleWQEdeath' | grep -oE '"isCompleted":.*') if [[ ${response} == *false* ]] then status="running" elif [[ ${response} == *true* ]] then echo "Job $runId has finished" finished_runs="$finished_runs $runId" else echo "We have a problem - did not expect this" fi fi done if [[ ${status} == "finished" ]] then echo "All finished" break fi done # Now do the reduce step because you have all the data readyThanks Jonathan Schaller for helping me fix this.
Pull runId out of JSON response
As a minor digression ... there are better general purpose JSON parsers like jq, but this is good enough. Send the curl response to ${response_file} and then:
```bash
runId_quoted=$(grep -oE '"runId":"(.*)",' ${response_file} | cut -d: -f2)
runId=${runId_quoted:1:${#runId_quoted}-3}
```

Grant one project write access to another

Project A imports Project B's files, so A can easily read B's files. However, A cannot write to B. The way I hack around this is to share A's files with B, and then in Project B create a script that copies A's files over.
```bash
key=$1
USER=$(whoami)
echo "Chilling for a minute so Project A can finish syncing"
sleep 1m
cp -r /mnt/${USER}/Project_A/data/results_${key}* /mnt/${USER}/Project_B/data && echo "Success"
ls -l /mnt/${USER}/Project_B/data/results_${key}*
```

Let's suppose the above sits in Project B with the name do_the_copy.sh. In order to drive this from Project A I write a little "beam me up Scotty" script:
```bash
#!/usr/bin/env bash
key=$1
curl -X POST \
  https://api.dominodatalab.com/v1/projects/$USER/mapreduce/runs \
  -H 'X-Domino-Api-Key: YOUR_API_KEY' \
  -H "Content-Type: application/json" \
  -d '{"command": ["do_the_copy.sh", "'"${key}"'"], "isDirect": false}'
```

This uses the Domino API to kick-start the process. It is a kludge and only really works at the end of a run, because Project A's files (assuming they have changed) can't be seen by Project B until after the sync occurs. A more sophisticated approach (which also solves other issues) is to wrap the Domino task, for example in a Luigi task, as in this example by, you guessed it, Jonathan Schaller. As noted above, you don't necessarily need an explicit collection step if you write results to the main branch.
Domino API - job status
Checking on a job...
```python
from domino import Domino

domino_api = Domino(domino_project_path)

def get_run_info(run_id):
    for run_info in domino_api.runs_list()['data']:
        if run_info['id'] == run_id:
            return run_info
```

Thanks Jonathan.
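A typical use is to poll until a run you just launched reports completion; I'm assuming here that the run records carry the same isCompleted flag seen in the raw API responses above:

```python
import time

def wait_for_run(run_id, poll_seconds=60):
    # Block until the given Domino run shows up as completed
    while True:
        info = get_run_info(run_id)
        if info is not None and info.get('isCompleted'):
            return info
        time.sleep(poll_seconds)
```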
Domino API - data status
```python
def get_blob_key(commit_id, filepath):
    dir, filename = os.path.split(filepath)
    files = domino_api.files_list(commit_id, path=dir)['data']
    for file in files:
        if file['path'] == filepath:
            return file['key']

def file_exists_in_commit(filepath, commit_id):
    # filepath is relative to project root
    files_list = domino_api.files_list(commit_id, path='/')['data']
    for file in files_list:
        if filepath == file['path']:
            return True
    return False
```

Thanks again.
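For example, before kicking off a dependent step you can check that an expected results file made it into a given commit; the commit id and file path below are placeholders:

```python
commit_id = 'SOME_COMMIT_ID'        # placeholder - use the commit you care about
results_path = 'data/results_A.csv'  # placeholder path relative to the project root

if file_exists_in_commit(results_path, commit_id):
    print('Results are in, blob key: {0}'.format(get_blob_key(commit_id, results_path)))
else:
    print('Still waiting on {0}'.format(results_path))
```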
Rich man's map-reduce
As noted above, another way to handle large pipelines is to mix a Domino task into some pipeline framework, such as Luigi. See Jonathan's example here.
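This is not Jonathan's code, but to give a flavour, a wrapper task might look roughly like the sketch below. I'm assuming that python-domino's runs_start accepts the command as a list and returns the runId seen in the raw API responses earlier, and that the records from runs_list carry the isCompleted flag; check the client you have installed.

```python
import time
import luigi
from domino import Domino

class DominoRunTask(luigi.Task):
    """Start a Domino run and block until it completes."""
    command = luigi.Parameter()      # e.g. "do_the_copy.sh"
    arg = luigi.Parameter(default="")
    marker = luigi.Parameter()       # local file recording completion

    def output(self):
        return luigi.LocalTarget(self.marker)

    def run(self):
        api = Domino('YOUR_USER/YOUR_PROJECT')
        started = api.runs_start([self.command, self.arg])
        run_id = started['runId']
        while True:
            time.sleep(60)
            info = next((r for r in api.runs_list()['data'] if r['id'] == run_id), None)
            if info is not None and info.get('isCompleted'):
                break
        with self.output().open('w') as f:
            f.write(run_id)
```

Downstream tasks can then simply require() this one, and the reduce step runs once all the map runs have finished and synced.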
Ignore it and it will go away
I assume the reader is familiar with the usage of .dominoignore. But a classic catch-22 arises when you forget to ignore files: too many are generated, and syncing becomes a pain - including syncing of the new .dominoignore file that would get you out of the predicament. Sometimes I have had difficulty editing the remote version of .dominoignore through the web browser - the obvious solution - and that makes it really difficult to wiggle out of the space issue. So I just create a launcher for it.
```bash
#!/usr/bin/env bash
ignore_file=$1
echo "${ignore_file}" >> /mnt/USER/PROJECT/.dominoignore
```

Trivial but exceedingly useful.
Maintaining a high level of motivation
Just had a nine hour experiment crash at the final stage? Remember that we all face challenges, and none more difficult than those solved by MacGyver.py