Tuesday, December 4, 2012

How to write SOQL to traverse across two objects and their junction object


Last month I built a prototype for a Salesforce organized event in Calgary to show how to build applications in the cloud for the energy industry.

It was a fairly simple system:  it's about deploying rigs to wells so that drilling can happen, and so that oil can be extracted.

This is a prototype to show what Salesforce can do so the use case is very simplified.  To model this use case, I have a Rig object, a Well object and a junction object called the Rig Deployment that records which rig is deployed to which well, start date and the scheduled end date, etc.  This Rig Deployment object contains all current and past deployments so this object has potential to contain a lot of data.

Using the schema builder, the entity relationship diagram looks like this:




The first screen of the prototype is to display a google map of deployed rigs in Canada (there are already lots of examples online to show how to display google maps with Visualforce and Apex so I will not cover that here, however if you want me to go through that then please leave me a note).

This first screen requires a query into the Rig Deployment object because the date related fields in the object contain information on which rigs are currently deployed.  Remember that this object contains current and past data, and we only want the current data.  All I need to display is a map with wells that have some rigs deployed at the location, the names of the rigs deployed and the scheduled completion date.

If you have read my last blog, you already know that when I started learning SOQL I had a few challenges because I had to change my mindset when it comes to writing the query language.  In traditional SQL you can write the query with a simple SELECT statement and a few INNER JOINs and you're done.  Once again, writing a SOQL query to traverse across two objects and the junction object was a daunting task at first.  You really need to divide and conquer.  Here is how.

First, you need to know that there are two ways in which you can traverse objects in SOQL.  One is to go from child to parent, and one is to go from parent to children.

Going from child to parent is pretty straightforward because all you need is the relationship name (which ends   in __r for custom objects).  The Rig Deployment object is the child object of both rig object and well object.  So let's say you want to retrieve the name of the well, then the SOQL is simply:

SELECT 
    Scheduled_Completion_Date__c,
    Well__r.Name 
FROM 
    Rig_Deployment__c 

which will return all the records in Rig Deployment object with the name of the well.  Note that because we are traversing the information from child to parent, when we access the name of the well, we need to indicate the relationship by using Well__r instead of Well__c.  Remember we only want the information of currently deployed wells, so we need to include the date filter.  The above will be rewritten as:


SELECT 
    Scheduled_Completion_Date__c,
    Well__r.Name 
FROM 
    Rig_Deployment__c 
WHERE
    Completion_Date__c = NULL

Going from parent to child looks a little more complicated but only because you need to write a sub-query.   Now this time let's say you want to go from Rig to Rig Deployment object to display a list of rigs and the scheduled completion date for the current deployment.  The SOQL is:

SELECT
    Name, 
    (
    SELECT 
        Scheduled_Completion_Date__c
    FROM
        Rig_Deployments__r
    WHERE
        Completion_Date__c = NULL
    )
FROM
    Rig__c

Remember that inside the sub-query you need to pluralize the object (because it's the child object and is supposed to have more than one record) and then append the pluralized object name with __r instead of __c because you are traversing from the Rig object to its related Rig Deployment object.

Now the last piece of the puzzle.  You need to combine the two queries together.  But look!  Did you see how almost identical the two queries already are?  The only differences between the two are:
1) The second query contains the column indicating the name of the well, and
2) The second query uses the pluralized object name that ends with __r in the FROM clause. 

That means in this case all you really need to do is to add the well name column into the second query and you're almost done.

SELECT
    Name, 
    (
    SELECT 
        Scheduled_Completion_Date__c,
        Well__r.Name
    FROM
        Rig_Deployments__r
    WHERE
        Completion_Date__c = NULL
    )
FROM
    Rig__c

Now, I said almost done because it seems to me SOQL joins by outer join by default.  So the above query will return all rigs, even though they are not currently deployed anywhere.

If you ONLY want rigs that are currently deployed, excluding rigs that are sitting idle, then you'll need to add a WHERE clause.  But worry not.  You already wrote it, sort of, when you wrote the first query above.  The WHERE clause will indicate that the rig must be in the Rig Deployment object that has a blank completion date.  So the final query is merely an extension to the last query.

This is the final SOQL (the Rig__c in the Rig Deployment object is the lookup field to the Rig object - that is, it stores the Salesforce record id).

SELECT
    Name, 
    (
    SELECT 
        Scheduled_Completion_Date__c,
        Well__r.Name
    FROM
        Rig_Deployments__r
    WHERE
        Completion_Date__c = NULL
    )
FROM
    Rig__c

WHERE
    Id IN
    (
    SELECT 
        Rig__c
    FROM
        Rig_Deployment__c
    WHERE
        Completion_Date__c = NULL
    )

You see that my main query retrieves data from the Rig object.  However it doesn't have to be this way.  You can also use the Well object as your main object in the query.  It all depends on what fields you are trying to retrieve.  The lesson I learned from this is that the junction object ends up sitting in the sub-query.  It is after all, the child object so it should be used in the sub-query.

Now, does anyone know is there an even simpler way to write an inner join without having to write the last WHERE clause I added above?



7 comments:

  1. Thanks ... Gr8 Article...
    This is what I needed ... :)

    ReplyDelete
  2. Rams Chowdary: I was looking for it for a while. Thanks much :)

    ReplyDelete
  3. Hi thanks for the example.
    I have one scenario where i have jjunction object and one lookup parent to one of the master to this junction. Can you help me with this.

    ReplyDelete
  4. Thanks. Could you also please also post about the data getting displayed in Maps using VF and Apex

    ReplyDelete
  5. You made it all really simple to understand. Thanks a ton.

    ReplyDelete
  6. Hello,

    If your query is only concerned about the current deployment of rigs, couldn't you just use the Rig_Deployment object as the basis for your data set and access fields on related records by traversing upward (from child to parent)? I think of this as a "inner join" from the child object perspective as long as you include a filter to ensure that there is a parent record specified for each child record. Let me know if this works for you:

    SELECT
    Rig__r.Name,
    Scheduled_Completion_Date__c,
    Well__r.Name
    FROM
    Rig_Deployments__c
    WHERE
    (
    Completion_Date__c = NULL
    AND
    Rig__c != NULL
    )

    ReplyDelete
  7. I havbe a question about junction object.
    I have a many to many relationship with 3 objects Position, Candidate, Job Application(Junction Object). I wanted to retrieve all the positions applied by a particular candidate and also the positions not applied by any candidate using one query. Is it possible using semi joins and anti joins in a query? Or I have to create separate lists and then combine them into third list.

    ReplyDelete

Please leave a comment.