This legacy version of AI Platform Prediction is deprecated and will no longer be available on Google Cloud after January 31, 2025. All models, associated metadata, and deployments will be deleted after January 31, 2025. Migrate your resources to Vertex AI to get new machine learning features that are unavailable in AI Platform.
When you don't need your predictions right away, or when you have a large number of instances to get predictions for, you can use the batch prediction service. This page describes how to start AI Platform Prediction batch prediction jobs. AI Platform Prediction only supports getting batch predictions from TensorFlow models.
Learn about online versus batch prediction or read an overview of prediction concepts.
Before you begin
To request predictions, you must first:
Create a model resource and a version resource, or put a TensorFlow SavedModel in a Cloud Storage location that your project can access.

- If you choose to use a version resource for batch prediction, you must create the version with the `mls1-c1-m2` machine type.
Set up a Cloud Storage location that your project has access to for:
- Input data files. This can be multiple locations, and your project must be authorized to read from each.
- Output files. You can only specify one output path, and your project must be authorized to write data to it.
Verify that your input file is in the correct format for batch prediction.
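For the JSON data format, a quick local sanity check can confirm that every line of an input file parses as a standalone JSON object. The sketch below is illustrative, not an official validator, and assumes a local copy of the file:

```python
import json

def check_json_lines(path):
    """Verify that every line of a JSON-format input file is a valid JSON object."""
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            try:
                instance = json.loads(line)
            except json.JSONDecodeError as e:
                raise ValueError('Line {} is not valid JSON: {}'.format(line_no, e))
            if not isinstance(instance, dict):
                raise ValueError('Line {} is not a JSON object'.format(line_no))
    return True
```

Running this before submitting a job catches malformed lines early, rather than discovering them in the job's error output.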
Configuring a batch prediction job
To start your batch prediction job, you'll need to gather some configuration data. This is the same data that is contained in the `PredictionInput` object you use when calling the API directly:
- Data format

  The type of input format you use for your input files. All of your input files for a given job must use the same data format. Set to one of these values:

  - `JSON`

    Your input files are plain text with an instance on each line. This is the format described on the prediction concepts page.

  - `TF_RECORD`

    Your input files use the TensorFlow TFRecord format.

  - `TF_RECORD_GZIP`

    Your input files are GZIP-compressed TFRecord files.
- Input paths

  The URIs of your input data files, which must be in Cloud Storage locations. You can specify:

  - Paths to specific files: `gs://path/to/my/input/file.json`
  - Paths to directories with a single asterisk wildcard, to indicate all files in that directory: `gs://path/to/my/input/*`
  - Paths to partial filenames with a single asterisk wildcard at the end, to indicate all files that start with the provided sequence: `gs://path/to/my/input/file*`
  You can combine multiple URIs. In Python, you make a list of them. If you use the Google Cloud CLI or call the API directly, you can list multiple URIs separated by commas, with no spaces between them. This is the correct format for the `--input-paths` flag:

  ```
  --input-paths gs://a/directory/of/files/*,gs://a/single/specific/file.json,gs://a/file/template/data*
  ```
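In Python the same paths are simply a list; collapsing that list to the comma-separated flag form is a one-liner. A small sketch (using the placeholder paths from above):

```python
# The same input paths as a Python list (for the API request body)...
input_paths = [
    'gs://a/directory/of/files/*',
    'gs://a/single/specific/file.json',
    'gs://a/file/template/data*',
]

# ...collapsed to the comma-separated, space-free string for --input-paths.
gcloud_flag_value = ','.join(input_paths)
print(gcloud_flag_value)
```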
- Output path

  The path to the Cloud Storage location where you want the prediction service to save your results. Your project must have permission to write to this location.
- Model name and version name

  The name of the model and, optionally, the version you want to get predictions from. If you don't specify a version, the model's default version is used. For batch prediction, the version must use the `mls1-c1-m2` machine type. If you provide a model URI (see the following section), omit these fields.
- Model URI

  You can get predictions from a model that isn't deployed on AI Platform Prediction by specifying the URI of the SavedModel you want to use. The SavedModel must be stored on Cloud Storage.
  To summarize, you have three options for specifying the model to use for batch prediction. You can use:

  - The model name by itself, to use the model's default version.
  - The model and version names, to use a specific model version.
  - The model URI, to use a SavedModel that is on Cloud Storage but not deployed to AI Platform Prediction.
- Region

  The Google Compute Engine region where you want to run your job. For best performance, run your prediction job and store your input and output data in the same region, especially for very large datasets. AI Platform Prediction batch prediction is available in the following regions:

  - us-central1
  - us-east1
  - europe-west1
  - asia-east1

  To fully understand the available regions for AI Platform Prediction services, including model training and online prediction, read the guide to regions.
- Job name

  A name for your job, which must:

  - Contain only mixed-case (case-sensitive) letters, digits, and underscores.
  - Start with a letter.
  - Contain no more than 128 characters.
  - Be unique among all training and batch prediction job names ever used in your project. This includes all jobs that you created in your project, regardless of their success or status.
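The naming rules above can be checked locally before submission. This is a sketch using a regex of my own, not an official validator:

```python
import re

# Letter first, then letters/digits/underscores, 128 characters total at most.
_JOB_NAME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9_]{0,127}$')

def is_valid_job_name(name):
    """Return True if `name` satisfies the documented job-name rules
    (uniqueness must still be checked against your project's job history)."""
    return bool(_JOB_NAME_RE.match(name))
```

Note that the regex cannot check the uniqueness requirement; that depends on the jobs already created in your project.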
- Batch size (optional)

  The number of records per batch. The service buffers `batch_size` records in memory before invoking your model. Defaults to 64 if not specified.

- Labels (optional)

  You can add labels to your job to organize and sort jobs into categories when viewing or monitoring resources. For example, you could sort jobs by team (by adding labels like `engineering` or `research`) or by development phase (`prod` or `test`). To add labels to your prediction job, provide a list of `KEY=VALUE` pairs.

- Maximum worker count (optional)
  The maximum number of prediction nodes to use in the processing cluster for this job. This is your way to put an upper limit on the automatic scaling feature of batch prediction. If you don't specify a value, it defaults to 10. Regardless of the value you specify, scaling is limited by the prediction node quota.
- Runtime version (optional)

  The AI Platform Prediction runtime version to use for the job. This option is included so that you can specify a runtime version to use with models that aren't deployed on AI Platform Prediction. Always omit this value for deployed model versions, which signals the service to use the same runtime version that was specified when the model version was deployed.
- Signature name (optional)

  If your saved model has multiple signatures, use this option to specify a custom TensorFlow signature name, which allows you to select an alternative input/output map defined in the TensorFlow SavedModel. See the TensorFlow documentation on SavedModel for a guide to using signatures, and the guide to specifying the outputs of a custom model. The default is `DEFAULT_SERVING_SIGNATURE_DEF_KEY`, which has the value `serving_default`.
The following examples define variables to hold configuration data.
gcloud
It isn't necessary to create variables when using the gcloud command-line tool to start a job. However, doing so here makes the job submission command much easier to enter and read.
```
DATA_FORMAT="text"  # JSON data format
INPUT_PATHS='gs://path/to/your/input/data/*'
OUTPUT_PATH='gs://your/desired/output/location'
MODEL_NAME='census'
VERSION_NAME='v1'
REGION='us-east1'
now=$(date +"%Y%m%d_%H%M%S")
JOB_NAME="census_batch_predict_$now"
MAX_WORKER_COUNT="20"
BATCH_SIZE="32"
LABELS="team=engineering,phase=test,owner=sara"
```
Python
When you use the Google API Client Library for Python, you can use Python dictionaries to represent the `Job` and `PredictionInput` resources.
Format your project name and your model or version name with the syntax used by the AI Platform Prediction REST APIs:
- project_name -> 'projects/project_name'
- model_name -> 'projects/project_name/models/model_name'
- version_name -> 'projects/project_name/models/model_name/versions/version_name'
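These resource name formats can be produced with simple helpers, sketched here (the helper names are my own, not part of the client library):

```python
def project_path(project_name):
    """'projects/project_name'"""
    return 'projects/{}'.format(project_name)

def model_path(project_name, model_name):
    """'projects/project_name/models/model_name'"""
    return '{}/models/{}'.format(project_path(project_name), model_name)

def version_path(project_name, model_name, version_name):
    """'projects/project_name/models/model_name/versions/version_name'"""
    return '{}/versions/{}'.format(model_path(project_name, model_name), version_name)

print(version_path('my-project', 'census', 'v1'))
# projects/my-project/models/census/versions/v1
```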
Create a dictionary for the `Job` resource and populate it with two items:

- A key named `'jobId'` with the job name you want to use as its value.
- A key named `'predictionInput'` that contains another dictionary object housing all of the required members of `PredictionInput`, and any optional members that you want to use.
The following example shows a function that takes the configuration information as input variables and returns the prediction request body. In addition to the basics, the example also generates a unique job identifier based on your project name, model name, and the current time.
```python
import time
import re


def make_batch_job_body(project_name, input_paths, output_path,
                        model_name, region, data_format='JSON',
                        version_name=None, max_worker_count=None,
                        runtime_version=None):

    project_id = 'projects/{}'.format(project_name)
    model_id = '{}/models/{}'.format(project_id, model_name)
    if version_name:
        version_id = '{}/versions/{}'.format(model_id, version_name)

    # Make a jobName of the format "model_name_batch_predict_YYYYMMDD_HHMMSS"
    timestamp = time.strftime('%Y%m%d_%H%M%S', time.gmtime())

    # Make sure the project name is formatted correctly to work as the basis
    # of a valid job name.
    clean_project_name = re.sub(r'\W+', '_', project_name)

    job_id = '{}_{}_{}'.format(clean_project_name, model_name, timestamp)

    # Start building the request dictionary with required information.
    body = {'jobId': job_id,
            'predictionInput': {
                'dataFormat': data_format,
                'inputPaths': input_paths,
                'outputPath': output_path,
                'region': region}}

    # Use the version if present, the model (its default version) if not.
    if version_name:
        body['predictionInput']['versionName'] = version_id
    else:
        body['predictionInput']['modelName'] = model_id

    # Only include a maximum number of workers or a runtime version if
    # specified. Otherwise let the service use its defaults.
    if max_worker_count:
        body['predictionInput']['maxWorkerCount'] = max_worker_count

    if runtime_version:
        body['predictionInput']['runtimeVersion'] = runtime_version

    return body
```
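For a versioned model, the request body produced this way has the following shape. The values here are placeholders of my own (including the timestamp in the job ID), shown only to illustrate the structure:

```python
# Illustrative request body for a versioned model (all values are placeholders).
batch_predict_body = {
    'jobId': 'my_project_census_20240101_000000',
    'predictionInput': {
        'dataFormat': 'JSON',
        'inputPaths': ['gs://path/to/your/input/data/*'],
        'outputPath': 'gs://your/desired/output/location',
        'region': 'us-east1',
        'versionName': 'projects/my_project/models/census/versions/v1',
    },
}
```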
Submitting a batch prediction job
Submitting your job is a simple call to `projects.jobs.create` or its command-line tool equivalent, `gcloud ai-platform jobs submit prediction`.
gcloud
The following example uses the variables defined in the previous section to start batch prediction.
```
gcloud ai-platform jobs submit prediction $JOB_NAME \
    --model $MODEL_NAME \
    --input-paths $INPUT_PATHS \
    --output-path $OUTPUT_PATH \
    --region $REGION \
    --data-format $DATA_FORMAT
```
Python
Starting a batch prediction job with the Google API Client Library for Python follows a similar pattern to other client SDK procedures:
1. Prepare the request body to use for the call (this is shown in the previous section).
2. Form the request by calling `ml.projects().jobs().create()`.
3. Call `execute()` on the request to get a response, making sure to check for HTTP errors.
4. Use the response as a dictionary to get values from the `Job` resource.
You can use the Google API Client Library for Python to call the AI Platform Training and Prediction API without manually constructing HTTP requests. Before you run the following code sample, you must set up authentication.
How to set up authentication
To set up authentication, you need to create a service account key and set an environment variable for the file path to the service account key.
1. Create a service account:

   1. In the Google Cloud console, go to the Create service account page.
   2. In the Service account name field, enter a name.
   3. Optional: In the Service account description field, enter a description.
   4. Click Create.
   5. Click the Select a role field. Under All roles, select AI Platform > AI Platform Admin.
   6. Click Add another role.
   7. Click the Select a role field. Under All roles, select Storage > Storage Object Admin.
   8. Click Done to create the service account.

   Do not close your browser window. You will use it in the next step.
2. Create a service account key for authentication:

   1. In the Google Cloud console, click the email address for the service account that you created.
   2. Click Keys.
   3. Click Add key, then Create new key.
   4. Click Create. A JSON key file is downloaded to your computer.
   5. Click Close.
3. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the file path of the JSON file that contains your service account key. This variable applies only to your current shell session, so if you open a new session, set the variable again.
Example: Linux or macOS

Replace [PATH] with the file path of the JSON file that contains your service account key.

```
export GOOGLE_APPLICATION_CREDENTIALS="[PATH]"
```

For example:

```
export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"
```
Example: Windows

Replace [PATH] with the file path of the JSON file that contains your service account key, and [FILE_NAME] with the filename.

With PowerShell:

```
$env:GOOGLE_APPLICATION_CREDENTIALS="[PATH]"
```

For example:

```
$env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\[FILE_NAME].json"
```

With command prompt:

```
set GOOGLE_APPLICATION_CREDENTIALS=[PATH]
```
```python
import googleapiclient.discovery as discovery
from googleapiclient import errors

project_id = 'projects/{}'.format(project_name)

ml = discovery.build('ml', 'v1')
request = ml.projects().jobs().create(parent=project_id,
                                      body=batch_predict_body)

try:
    response = request.execute()
    print('Job requested.')

    # The state returned will almost always be QUEUED.
    print('state : {}'.format(response['state']))

except errors.HttpError as err:
    # Something went wrong; print out some information.
    print('There was an error getting the prediction results. '
          'Check the details:')
    print(err._get_reason())
```
Monitoring your batch prediction job
A batch prediction job can take a long time to finish. You can monitor your job's progress using the Google Cloud console:
1. Go to the AI Platform Prediction Jobs page in the Google Cloud console.
2. Click on your job's name in the Job ID list. This opens the Job details page.
3. The current status is shown with the job name at the top of the page.
4. If you want more details, you can click View logs to see your job's entry in Cloud Logging.
There are other ways to track the progress of your batch prediction job. They follow the same patterns as monitoring training jobs. You'll find more information on the page describing how to monitor your training jobs. You may need to adjust the instructions there slightly to work with prediction jobs, but the mechanisms are the same.
Getting prediction results
The service writes predictions to the Cloud Storage location you specify. There are two types of output files that might include interesting results:
- Files named `prediction.errors_stats-NNNNN-of-NNNNN` contain information about any problems encountered during the job.
- JSON Lines files named `prediction.results-NNNNN-of-NNNNN` contain the predictions themselves, as defined by your model's output.

The filenames include index numbers (shown above as an 'N' for each digit) that capture how many files in total you should find. For example, a job that has six result files includes `prediction.results-00000-of-00006` through `prediction.results-00005-of-00006`.
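The shard naming scheme can be reproduced with a short sketch, which is handy when you need to enumerate or download every result file for a job:

```python
def result_shard_names(total):
    """List the prediction.results shard filenames for a job that produced
    `total` output files, using the zero-padded five-digit scheme."""
    return ['prediction.results-{:05d}-of-{:05d}'.format(i, total)
            for i in range(total)]

print(result_shard_names(6)[0])   # prediction.results-00000-of-00006
print(result_shard_names(6)[-1])  # prediction.results-00005-of-00006
```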
Every line of each prediction file is a JSON object representing a single prediction result. You can open the prediction files with your choice of text editor. For a quick look on the command line, you can use `gsutil cat`:

```
gsutil cat $OUTPUT_PATH/prediction.results-NNNNN-of-NNNNN | less
```
Remember that your prediction results are not typically output in the same order as your input instances, even if you use only a single input file. You can find the prediction for an instance by matching the instance keys.
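For example, if each input instance carries a passed-through `key` field (the field name here is an assumption; it depends on your model's outputs), a sketch like this indexes the results so they can be matched back to inputs:

```python
import json

def index_predictions_by_key(result_lines):
    """Map each prediction (one JSON object per line) to its instance key."""
    predictions = {}
    for line in result_lines:
        record = json.loads(line)
        predictions[record['key']] = record
    return predictions

# Results often arrive out of order relative to the inputs.
lines = [
    '{"key": 2, "prediction": 0.9}',
    '{"key": 1, "prediction": 0.1}',
]
by_key = index_predictions_by_key(lines)
print(by_key[1]['prediction'])  # 0.1
```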
What's Next
- Use online prediction.
- Get more details about the prediction process.
- Troubleshoot problems that arise when you request online predictions.
- Learn about using labels to organize your jobs.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-01-12 UTC.