Getting batch predictions  |  AI Platform Prediction  |  Google Cloud

This legacy version of AI Platform Prediction is deprecated and will no longer be available on Google Cloud after January 31, 2025. All models, associated metadata, and deployments will be deleted after January 31, 2025. Migrate your resources to Vertex AI to get new machine learning features that are unavailable in AI Platform.

When you don't need your predictions right away, or when you have a large number of instances to get predictions for, you can use the batch prediction service. This page describes how to start AI Platform Prediction batch prediction jobs. AI Platform Prediction only supports getting batch predictions from TensorFlow models.

Learn about online versus batch prediction or read an overview of prediction concepts.

Before you begin

In order to request predictions, you must first:

  • Create a model resource and a version resource or put a TensorFlow SavedModel in a Cloud Storage location that your project can access.

    • If you choose to use a version resource for batch prediction, you must create the version with the mls1-c1-m2 machine type.
  • Set up a Cloud Storage location that your project has access to for:

    • Input data files. This can be multiple locations, and your project must be authorized to read from each.

    • Output files. You can only specify one output path, and your project must be authorized to write data to it.

  • Verify that your input file is in the correct format for batch prediction.

Configuring a batch prediction job

To start your batch prediction job, you'll need to gather some configuration data. This is the same data that is contained in the PredictionInput object you use when calling the API directly:

Data format

The type of input format you use for your input files. All of your input files for a given job must use the same data format. Set to one of these values:

JSON

Your input files are plain text with an instance on each line. This is the format described on the prediction concepts page; a short sample appears after this list of values.

TF_RECORD

Your input files use the TensorFlow TFRecord format.

TF_RECORD_GZIP

Your input files are GZIP-compressed TFRecord files.
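
For reference, an input file in the JSON data format is plain text with one JSON instance per line. A minimal, hypothetical two-instance example (the feature names and the key field are placeholders for whatever your model's serving signature actually expects):

{"x": [1.0, 2.0], "key": 0}
{"x": [3.0, 4.0], "key": 1}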

Input paths

The URIs of your input data files, which must be in Cloud Storage locations. You can specify:

  • Paths to specific files: 'gs://path/to/my/input/file.json'.

  • Paths to directories with a single asterisk wildcard, to indicate all files in that directory: 'gs://path/to/my/input/*'.

  • Paths to partial filenames with a single asterisk wildcard at the end, to indicate all files that start with the provided sequence: 'gs://path/to/my/input/file*'.

You can combine multiple URIs. In Python you make a list of them. If you use the Google Cloud CLI, or call the API directly, you can list multiple URIs, separated by commas, but with no space in between them. This is the right format for the --input-paths flag:

 --input-paths gs://a/directory/of/files/*,gs://a/single/specific/file.json,gs://a/file/template/data*

Output path

The path to the Cloud Storage location where you want the prediction service to save your results. Your project must have permissions to write to this location.

Model name and version name

The name of the model and, optionally, the version you want to get predictions from. If you don't specify a version, the model's default version is used. For batch prediction, the version must use the mls1-c1-m2 machine type.

If you provide a Model URI (see the following section), omit these fields.

Model URI

You can get predictions from a model that isn't deployed on AI Platform Prediction by specifying the URI of the SavedModel you want to use. The SavedModel must be stored on Cloud Storage.

To summarize, you have three options for specifying the model to use for batch prediction. You can use:

  • The model name by itself to use the model's default version.

  • The model and version names to use a specific model version.

  • The model URI to use a SavedModel that is on Cloud Storage, but not deployed to AI Platform Prediction.

Region

The Google Compute Engine region where you want to run your job. For best performance, you should run your prediction job and store your input and output data in the same region, especially for very large datasets. AI Platform Prediction batch prediction is available in the following regions:

  • us-central1
  • us-east1
  • europe-west1
  • asia-east1

To fully understand the available regions for AI Platform Prediction services, including model training and online prediction, read the guide to regions.

Job name

A name for your job, which must:

  • Contain only mixed-case (case sensitive) letters, digits, and underscores.
  • Start with a letter.
  • Contain no more than 128 characters.
  • Be unique among all training and batch prediction job names ever used in your project. This includes all jobs that you created in your project, regardless of their success or status.
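
As an illustration of these rules, a candidate job name can be checked with a small regular expression. The helper below is a hypothetical sketch, not part of any AI Platform tooling:

import re

# Letters, digits, and underscores only; must start with a letter;
# at most 128 characters (1 leading letter + up to 127 more).
def is_valid_job_name(name):
    return bool(re.match(r'^[A-Za-z][A-Za-z0-9_]{0,127}$', name))
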
Batch size (optional)

The number of records per batch. The service buffers batch_size records in memory before invoking your model. Defaults to 64 if not specified.

Labels (optional)

You can add labels to your job to organize and sort jobs into categories when viewing or monitoring resources. For example, you could sort jobs by team (by adding labels like engineering or research) or by development phase (prod or test). To add labels to your prediction job, provide a list of KEY=VALUE pairs.
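
With the gcloud tool, labels are passed with the --labels flag (see the LABELS variable in the example later on this page). In the Python request body, labels sit on the Job resource itself. A minimal sketch, where the keys and values are only examples:

body['labels'] = {'team': 'engineering', 'phase': 'test', 'owner': 'sara'}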

Maximum worker count (optional)

The maximum number of prediction nodes to use in the processing cluster for this job. This is your way to put an upper limit on the automatic scaling feature of batch prediction. If you don't specify a value, it defaults to 10. Regardless of the value you specify, scaling is limited by the prediction node quota.

Runtime version (optional)

The AI Platform Prediction runtime version to use for the job. This option is included so that you can specify a runtime version to use with models that aren't deployed on AI Platform Prediction. You should always omit this value for deployed model versions, which signals the service to use the same version that was specified when the model version was deployed.

Signature name (optional)

If your saved model has multiple signatures, use this option to specify a custom TensorFlow signature name, which allows you to select an alternative input/output map defined in the TensorFlow SavedModel. See the TensorFlow documentation on SavedModel for a guide to using signatures, and the guide to specifying the outputs of a custom model. The default is DEFAULT_SERVING_SIGNATURE_DEF_KEY, which has the value serving_default.
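
If you aren't sure which signature names your SavedModel exports, you can list them with TensorFlow itself. A minimal sketch, assuming TensorFlow 2.x and a placeholder model path:

import tensorflow as tf

# Load the SavedModel and print its available signature names,
# for example ['serving_default'].
loaded = tf.saved_model.load('gs://your-bucket/path/to/saved_model')
print(list(loaded.signatures.keys()))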

The following examples define variables to hold configuration data.

gcloud

It isn't necessary to create variables when using the gcloud command-line tool to start a job. However, doing so here makes the job submission command much easier to enter and read.

DATA_FORMAT="text" # JSON data format
INPUT_PATHS='gs://path/to/your/input/data/*'
OUTPUT_PATH='gs://your/desired/output/location'
MODEL_NAME='census'
VERSION_NAME='v1'
REGION='us-east1'
now=$(date +"%Y%m%d_%H%M%S")
JOB_NAME="census_batch_predict_$now"
MAX_WORKER_COUNT="20"
BATCH_SIZE="32"
LABELS="team=engineering,phase=test,owner=sara"

Python

When you use the Google API Client Library for Python, you can use Python dictionaries to represent the Job and PredictionInput resources.

  1. Format your project name and your model or version name with the syntax used by the AI Platform Prediction REST APIs:

    • project_name -> 'projects/project_name'
    • model_name -> 'projects/project_name/models/model_name'
    • version_name -> 'projects/project_name/models/model_name/versions/version_name'
  2. Create a dictionary for the Job resource and populate it with two items:

    • A key named 'jobId' with the job name you want to use as its value.

    • A key named 'predictionInput' that contains another dictionary object housing all of the required members of PredictionInput, and any optional members that you want to use.

    The following example shows a function that takes the configuration information as input variables and returns the prediction request body. In addition to the basics, the example also generates a unique job identifier based on your project name, model name, and the current time.

    import time
    import re

    def make_batch_job_body(project_name, input_paths, output_path,
            model_name, region, data_format='JSON',
            version_name=None, max_worker_count=None,
            runtime_version=None):

        project_id = 'projects/{}'.format(project_name)
        model_id = '{}/models/{}'.format(project_id, model_name)
        if version_name:
            version_id = '{}/versions/{}'.format(model_id, version_name)

        # Make a jobName of the format "model_name_batch_predict_YYYYMMDD_HHMMSS"
        timestamp = time.strftime('%Y%m%d_%H%M%S', time.gmtime())

        # Make sure the project name is formatted correctly to work as the basis
        # of a valid job name.
        clean_project_name = re.sub(r'\W+', '_', project_name)

        job_id = '{}_{}_{}'.format(clean_project_name, model_name, timestamp)

        # Start building the request dictionary with required information.
        body = {'jobId': job_id,
                'predictionInput': {
                    'dataFormat': data_format,
                    'inputPaths': input_paths,
                    'outputPath': output_path,
                    'region': region}}

        # Use the version if present, the model (its default version) if not.
        if version_name:
            body['predictionInput']['versionName'] = version_id
        else:
            body['predictionInput']['modelName'] = model_id

        # Only include a maximum number of workers or a runtime version if specified.
        # Otherwise let the service use its defaults.
        if max_worker_count:
            body['predictionInput']['maxWorkerCount'] = max_worker_count

        if runtime_version:
            body['predictionInput']['runtimeVersion'] = runtime_version

        return body
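
    For example, the function might be called like this (the project, bucket, and model names are placeholders):

    batch_predict_body = make_batch_job_body(
        project_name='my-project',
        input_paths=['gs://my-bucket/input/*'],
        output_path='gs://my-bucket/output',
        model_name='census',
        region='us-east1',
        version_name='v1')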

Submitting a batch prediction job

Submitting your job is a simple call to projects.jobs.create or its command-line tool equivalent, gcloud ai-platform jobs submit prediction.

gcloud

The following example uses the variables defined in the previous section to start batch prediction.

gcloud ai-platform jobs submit prediction $JOB_NAME \
    --model $MODEL_NAME \
    --input-paths $INPUT_PATHS \
    --output-path $OUTPUT_PATH \
    --region $REGION \
    --data-format $DATA_FORMAT

Python

Starting a batch prediction job with the Google API Client Library for Python follows a similar pattern to other client SDK procedures:

  1. Prepare the request body to use for the call (this is shown in the previous section).

  2. Form the request by calling ml.projects.jobs.create.

  3. Call execute on the request to get a response, making sure to check for HTTP errors.

  4. Use the response as a dictionary to get values from the Job resource.

You can use the Google API Client Library for Python to call the AI Platform Training and Prediction API without manually constructing HTTP requests. Before you run the following code sample, you must set up authentication.

How to set up authentication

To set up authentication, you need to create a service account key and set an environment variable for the file path to the service account key.

  1. Create a service account:

    1. In the Google Cloud console, go to the Create service account page.

      Go to Create service account

    2. In the Service account name field, enter a name.
    3. Optional: In the Service account description field, enter a description.
    4. Click Create.
    5. Click the Select a role field. Under All roles, select AI Platform > AI Platform Admin.
    6. Click Add another role.
    7. Click the Select a role field. Under All roles, select Storage > Storage Object Admin.

    8. Click Done to create the service account.

      Do not close your browser window. You will use it in the next step.

  2. Create a service account key for authentication:

    1. In the Google Cloud console, click the email address for the service account that you created.
    2. Click Keys.
    3. Click Add key, then Create new key.
    4. Click Create. A JSON key file is downloaded to your computer.
    5. Click Close.
  3. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the file path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.

    Example: Linux or macOS

    Replace [PATH] with the file path of the JSON file that contains your service account key.

    export GOOGLE_APPLICATION_CREDENTIALS="[PATH]"

    For example:

    export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"

    Example: Windows

    Replace [PATH] with the file path of the JSON file that contains your service account key, and [FILE_NAME] with the filename.

    With PowerShell:

    $env:GOOGLE_APPLICATION_CREDENTIALS="[PATH]"

    For example:

    $env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\[FILE_NAME].json"

    With command prompt:

    set GOOGLE_APPLICATION_CREDENTIALS=[PATH]
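
As an optional sanity check, not part of the steps above, you can confirm that the credentials resolve before calling the API. A minimal sketch using the google-auth library:

import google.auth

# Resolves Application Default Credentials and prints the project ID
# associated with the key file, if one can be determined.
credentials, project = google.auth.default()
print(project)
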
import googleapiclient.discovery as discovery
from googleapiclient import errors

project_id = 'projects/{}'.format(project_name)

ml = discovery.build('ml', 'v1')
request = ml.projects().jobs().create(parent=project_id,
                                      body=batch_predict_body)

try:
    response = request.execute()

    print('Job requested.')

    # The state returned will almost always be QUEUED.
    print('state : {}'.format(response['state']))

except errors.HttpError as err:
    # Something went wrong, print out some information.
    print('There was an error getting the prediction results. ' +
          'Check the details:')
    print(err._get_reason())

Monitoring your batch prediction job

A batch prediction job can take a long time to finish. You can monitor your job's progress using the Google Cloud console:

  1. Go to the AI Platform Prediction Jobs page in the Google Cloud console:

    Go to the Google Cloud console Jobs page

  2. Click on your job's name in the Job ID list. This opens the Job details page.

  3. The current status is shown with the job name at the top of the page.

  4. If you want more details, you can click View logs to see your job's entry in Cloud Logging.

There are other ways to track the progress of your batch prediction job. They follow the same patterns as monitoring training jobs. You'll find more information on the page describing how to monitor your training jobs. You may need to adjust the instructions there slightly to work with prediction jobs, but the mechanisms are the same.
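
For instance, here is a minimal sketch of polling the job state with the same Python client library used earlier; project_name and job_id are assumed from the previous sections:

import time
import googleapiclient.discovery as discovery

ml = discovery.build('ml', 'v1')
job_name = 'projects/{}/jobs/{}'.format(project_name, job_id)

# Poll projects.jobs.get until the job reaches a terminal state.
while True:
    job = ml.projects().jobs().get(name=job_name).execute()
    print('state : {}'.format(job['state']))
    if job['state'] in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        break
    time.sleep(60)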

Getting prediction results

The service writes predictions to the Cloud Storage location you specify. There are two types of output files that might include interesting results:

  • Files named prediction.errors_stats-NNNNN-of-NNNNN contain information about any problems encountered during the job.

  • JSON Lines files named prediction.results-NNNNN-of-NNNNN contain the predictions themselves, as defined by your model's output.

The filenames include index numbers (shown above as an 'N' for each digit) that capture how many files in total you should find. For example, a job that has six result files includes prediction.results-00000-of-00006 through prediction.results-00005-of-00006.

Every line of each prediction file is a JSON object representing a single prediction result. You can open the prediction files with your choice of text editor. For a quick look on the command line, you can use gsutil cat:

gsutil cat $OUTPUT_PATH/prediction.results-NNNNN-of-NNNNN | less

Remember that your prediction results are not typically output in the same order as your input instances, even if you use only a single input file. You can find the prediction for an instance by matching the instance keys.
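
As a minimal sketch of matching results back to instances, assume the output files have been copied locally (for example with gsutil cp) and that your model passes through a 'key' field; the filename and field name below are placeholders:

import json

# Index each JSON Lines result by its instance key so predictions can
# be matched to inputs regardless of output order.
predictions_by_key = {}
with open('prediction.results-00000-of-00006') as f:
    for line in f:
        result = json.loads(line)
        predictions_by_key[result['key']] = result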

What's Next

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2024-01-12 UTC.
