CasJobs Guide

Kepler CasJobs Guide

Introduction

Welcome to the Catalog Archive Server Jobs System, or CasJobs. This guide assumes you have experience with basic SQL syntax. If you'd like to learn more about SQL there are plenty of other tutorials on the web. This guide will cover the slightly simpler varient of SQL that CasJobs uses.

Registration

Before you can query this system you'll need to register and then log in. Your queries and their resulting data are associated with your account, so don't forget your login! But if lightning strikes, skies fall and you've consequently forgot your password it can be emailed to you from here. Your email may be used to contact you about your account, as well as optionally notifing about query completion. Your UserID will be used to identify you to the group world; should you choose to participate in this please pick a 'nice' UserID. A personal database, or MyDB is also created and assigned to your account on registration.

The 'MyDB' (Important!)

Your MyDB is a personal database. Once you have clicked on the MyDB tab after registration, your MyDB database will be created once and for all. You have full privileges in this context. For example, you can create/alter/drop tables, functions and stored procedures. The MyDB is designed to be a staging area where you may edit and refine your requested tables before eventually extracting them to a local copy in CSV, FITS, etc. format. TAKE THIS OUT?Please note that you can extract table results only from MyDB, so if you want to use any tables you must first place a copy of then into MyDB.

Getting your table data into MyDB

Getting data into MyDB is easy. Just write your query, then click submit and it will automatically create a table using the name from the 'Table' field (directly above the query box) in MyDB containing the results of your query. Alternatively, you use an 'into' statement in your query; an example of this is below.

Select top 1000 ktc_kepler_id,kic_ra,kic_dec into mydb.MyFirstTable from kepler_ktc_kic_ads_v

This will create a table called 'MyFirstTable' in MyDB containing the columns specified by the select clause from MAST's Kepler database. Note that 'into' statements are generally placed directly before the 'from' keyword of a query. Also, if your query doesn't have a 'from', then 'into' just goes at the end. An 'into' statement in your query always takes precedence over whatever is in the 'Table' field. So if you have both, only the one in the query is used. Please note that the 'Table' field is only used when the Submit button is clicked.

Queries submitted using the quick method will display results by default instead of automatically creating a table.

Downloading table results (also functions, procedures) from MyDB

Once you've got a table in MyDB, you can download that table in CSV (Comma Separated Variable), VOTABLE or FITS file format. This is done from the MyDB page. To do this, click MyDB (at the top), then click the table on the left, click 'Download', choose the download format and click 'Go'. You will be taken to the output page, where your output is listed under pending output, refresh this page periodically and the output will move to available output, if the output job is expected to be large you can log out and check at a later time. Once the output job is complete click on the download link to retrieve the file. Downloads have a lifetime of one week in the output area. The availability of the tables you can download is shown in the 'Output' tab in the top banner.

Importing table data into MyDB

You can also import your own customized tables into MyDB. This is done from the Import page link, located in the top banner. More information on importing tables is on that page.

Context

Before running CasJobs queries, it is important that you understand the term 'Context' by which we mean the particular database on which your query will be run. This is determined by the Context dropdown list on the Query page. So if the 'kepler' is selected, then your query will be run in kepler database, if 'MyDB' then MyDB, etc.

Queue

You can think of a 'queue' is a list of jobs awaiting execution. The 'Queue' selection, also a dropdown list, will determine the maximum time a query will be permitted to run (maximum is 500 minutes). Every context has at least one queue. If your chosen context has more than one queue it is best to choose the queue that is closest to the amount of time you think your query will take. Queues are optimized for queries of their time limit, so choosing an appropriate queue is generally faster.

Referencing Objects outside of their Context

For those situations where you'd like to reference data outside of your selected context you can prefix the foreign object. If you've been following this guide, you've already done this. This query, which returns the Kepler identification number and coordinates for objects, uses a 'MyDB' prefix:

Select top 1000 ktc_kepler_id,kic_ra,kic_dec into mydb.MyFirstTable from kepler_ktc_kic_ads_v

So assuming the context of this query was 'kepler', this means that 'MyFirstTable' is to be created at 'MyDB', not 'kepler'. Using prefixes, you could rewrite this query so it runs in 'MyDB' context, like:

Select top 1000 ktc_kepler_id,kic_ra,kic_dec into MyFirstTable from kepler.kepler_ktc_kic_ads_v

Since the context (database) has been changed to 'MyDB', this time the prefix 'kepler' indicates that the kepler_ktc_kic_ads_v view is not 'MyDB', but 'kepler'. Context prefixes can be used on tables and stored procedures, but not on SQL functions. Valid 'context' prefixes are any that appear in the 'Context' dropdown list in the Query page.

Quick Execution

If you go to the Query page, you'll notice there are two buttons at the top and to the far right of the query space, one labeled 'Submit', the other 'Quick'. These are different types of query execution. 'Quick' is a synchronous execution action and will run your query right away at the currently selected context. It returns the results right below the query window. The Quick execution option is the quickest way to run something and is ideal for quick things like creating functions, procedures, top 10's, etc. However there are restrictions on this form of execution: 'Quick' queries are limited to one minute of execution, and they can only return about one megabyte of results before they are cancelled.

'Submit' Execution

For queries that exceed 'Quick' limitations use the 'Submit' button. 'Submit' is an asynchronous execution action, that is it submits your query to the selected context and queue and will run it as soon as a slot in that queue is available. You can check the status of your queries from the History page, link located at the top of the page. (You must be logged in.) Note carefully that no notification of completion of the query job will be sent to your computer.

CasJobs Limitations due to Distributed Execution

One of the main purposes of the CasJobs service is to provide "load-balancing," i.e. to run large queries in background (asynchronously) so that small queries can be run in real time. However, this does introduce certain limitations and differences between some advanced standard SQL capabilities and CasJobs which may be confusing for users using to running queries in real time. In practice, we do not expect Kepler users to have to be concerned about these limitations, but you should be aware of them. For further information see the corresponding paragraph in the GALEX CasJobs Guide page.

Group Sharing of Table Data

Our 'Groups' facility provides a mechanism for allowing other users to view specified tables in your MyDB. They are managed from the 'Groups' page, link located in the top banner. From there you can join a Group, create a Group, or invite other users to join your Group. Users within a Group can see and reference tables that other group members have published. The visibility of the work in your account is conrolled by a Group Visbility check box under 'Profile' in the top banner. You may publish any table in your MyDB from the MyDB page by clicking on the desired table and then clicking 'Publish.' You may remove a table from publication at any time, but you cannot drop or rename that table until it has been depublished. Group tables that you man see are listed at the MyDB page under Tables. You may reference that table in your own query by using the same name you see in the MyDB page, specifically group.[user].[tablename]. Other group members have read-only access to your published tables;

For details on additional topics please see see the GALEX CasJobs guide page.

Help