Thursday, May 14, 2009

Report Query Development Cookbook

Report development, independent of tool can be a very daunting task. The following pages are designed to give you a step by step process when developing report queries that (in my experience) has led to fewer errors, a more efficient development process, and more accurate output. Before we get into it though, I apologize for the lack of material over the past couple of months; starting a BLOG is one thing, keeping it up is quite another. I hope that you find the information in this post useful! Happy hunting!

The Example


The following request is a recent request I worked on with a client. It is moderately complex but still lends itself well to the topic at hand. The request was as follows:
Financial Aid Applicant Profile: All financial aid applicants for a selected term and aid year. Please include the applicant's name, ID, ISIR recieved date, verification message, EFC, college, student level, package complete date, dependency status, campus, offered amount, accepted amount, paid amount, unmet need, gender, ethnicity, total credits, and overall GPA.
As you've probably already surmised, this query was not trivial. However, when approached using the framework in the following pages, it becomes a much less daunting endeavor!

Step 1 - The Population


It is imperative that we always begin any query with the population. Simply (hah! yeah right!) identify a list of PIDMS that exactly corresponds to the number of individuals you're trying to report. In the case above, this means anyone who has applied for financial aid during a given aid year. The following query is what will gather the group in quesiton:

select spriden_id ID,
spriden_last_name LastName,
spriden_first_name FirstName
from rcrapp1, spriden
where spriden_pidm = rcrapp1_pidm
and spriden_change_ind is null
and rcrapp1_aidy_code = :aidy
and rcrapp1_infc_code = 'ISIR'
and rcrapp1_curr_rec_ind = 'Y';

As you can see, the query is very simple, although it's missing all the attributes (college, level, etc) that make this report useful. However, we now have the foundation against which all additions can be compared. Additionally, we can "spot check" a couple of IDs to make sure that this population is indeed correct (it may even be a good idea to contact the requestor and have them "sniff test" the number in the group before proceeding further. The last thing to do before adding information is to count the number of records returned. Often, this can be done using built in functionality in whatever tool you're using to write your query. For the purposes of this example, let's assume that the query above returned 1200 applicants for 0809. Proceed to step 2!

Step 2 - Adding Attributes


After completing step one, completing the report becomes a matter of "bolting on" additional information. This should be done very methodically and should start with requested attributes that can already be found in the table(s) already being used to gather the population. In this case, we're already using RCRAPP1 and SPRIDEN, so we should go through the request and look for any other attributes that come from either of those two tables. In this case there are a couple of attributes that can be added from the RCRAPP1 table:

  • ISIR Recieved Date (RCRAPP1_RCPT_DATE)

  • and Verification Message (RCRAPP1_VERIFICATION_MSG)


Because the information requested already exists in the table(s) used, there is no need for additional join conditions. This makes adding the data itself trivial:

select spriden_id ID,
spriden_last_name LastName,
spriden_first_name FirstName,
rcrapp1_rcpt_date ISIR_Recd_Date,
rcrapp1_verification_msg Ver_Msg
from rcrapp1, spriden
where spriden_pidm = rcrapp1_pidm
and spriden_change_ind is null
and rcrapp1_aidy_code = :aidy
and rcrapp1_infc_code = 'ISIR'
and rcrapp1_curr_rec_ind = 'Y';

Notice, there have been no changes to the WHERE clause of the statment and no new tables have been joined to the query, so the number of individuals selected should still be the same. SHOULD BE THE SAME. At this point, it's time to check. Rerun the query and make sure that your record count has not changed. This process will be repeated after each new set of attributes are added. Now, let's see what we have left:

  • EFC (RCRAPP4_SAR_EFC)

  • college (SGBSTDN_COLL_CODE)

  • student level (SGBSTDN_LEVL_CODE)

  • package complete date (RORSTAT_PCKG_COMP_DATE)

  • dependency status (RCRAPP2_C_DEPEND_STATUS)

  • campus (SGBSTDN_CAMP_CODE)

  • offered amount (RPRAWRD_OFFER_AMT

  • accepted amount (RPRAWRD_ACCEPT_AMT)

  • paid amount (RPRAWRD_PAID_AMT)

  • unmet need (RORSTAT_UNMET_NEED)

  • gender (SPBPERS_SEX)

  • ethnicity (SPBPERS_ETHN_CODE)

  • total credits (SUM(SFRSTCR_CREDIT_HR)

  • and overall GPA (SHRLGPA_GPA)


Notice in the above list, that I also tabulated the fields in which the data in question can be found. It may be beneficial to do this yourselves as that task can often be a difficult one. Having the list ready and waiting will make things much smoother when it comes to writing the actual SQL.
At this point, I would typically pick out what I feel are the easiest tables to add and start with them. In most cases, tables with a "B" in the third character are a good place to start. However, SGBSTDN is an exception to that rule (go figure) becuase it is EFFECTIVE DATED. Have no fear though, SPBPERS is very easy to add because the PIDM is the primary key, meaning that only one instance of any given PIDM (student) can be in the table at a time. That being the case, the only thing you have to worry about when adding SPBPERS is joining SPBPERS_PIDM to another PIDM already in the query. In this case, we'll join to SPRIDEN as shown below:

select spriden_id ID,
spriden_last_name LastName,
spriden_first_name FirstName,
rcrapp1_rcpt_date ISIR_Recd_Date,
rcrapp1_verification_msg Ver_Msg,
spbpers_sex Gender,
spbpers_ethn_code Ethnicity

from rcrapp1, spriden, spbpers
where spriden_pidm = rcrapp1_pidm
and spbpers_pidm = spriden_pidm
and spriden_change_ind is null
and rcrapp1_aidy_code = :aidy
and rcrapp1_infc_code = 'ISIR'
and rcrapp1_curr_rec_ind = 'Y';

Now, as we did before, we need to run the query again and make sure that the number of records returned is still 1200. Assuming that's the case, it's time to move onto our next set of attributes. This time, I'm going to bite the bullet and add SGBSTDN. See the query below (the additions, shown in bold).

select spriden_id ID,
spriden_last_name LastName,
spriden_first_name FirstName,
rcrapp1_rcpt_date ISIR_Recd_Date,
rcrapp1_verification_msg Ver_Msg,
spbpers_sex Gender,
spbpers_ethn_code Ethnicity,
sgbstdn_coll_code College,
sgbstdn_levl_code Level,
sgbstdn_camp_code Campus

from rcrapp1, spriden, spbpers, sgbstdn a
where spriden_pidm = rcrapp1_pidm
and spbpers_pidm = spriden_pidm
and spriden_pidm = sgbstdn_pidm
and sgbstdn_term_code_eff = (select max(sgbstdn_term_code_eff)
from sgbstdn b
where a.sgbstdn_pidm = b.sgbstdn_pidm

and spriden_change_ind is null
and rcrapp1_aidy_code = :aidy
and rcrapp1_infc_code = 'ISIR'
and rcrapp1_curr_rec_ind = 'Y';

The order and tables used for the join conditions are not terribly critical unless you're trying to increase the performance of your query (in which case, there are certain rules that can be applied). That being the case, as shown above I have joined SGBSTDN_PIDM to SPRIDEN. Additionally, note the subquery to find the student's most current SGBSTDN record. Alright, let's take stock of where we're at at this point:

  • EFC (RCRAPP4_SAR_EFC)

  • package complete date (RORSTAT_PCKG_COMP_DATE)

  • dependency status (RCRAPP2_C_DEPEND_STATUS)

  • offered amount (RPRAWRD_OFFER_AMT

  • accepted amount (RPRAWRD_ACCEPT_AMT)

  • paid amount (RPRAWRD_PAID_AMT)

  • unmet need (RORSTAT_UNMET_NEED)

  • total credits (SUM(SFRSTCR_CREDIT_HR)

  • and overall GPA (SHRLGPA_GPA)


To help get this article finished (for Pete's sake!!!), I'm going to add everything except the registered credits in the next step. However, they should be added one table at a time and tested at each step to make sure that the record set has not changed. In any case, the query with those fields can be seen below:

select spriden_id ID,
spriden_last_name LastName,
spriden_first_name FirstName,
rcrapp1_rcpt_date ISIR_Recd_Date,
rcrapp1_verification_msg Ver_Msg,
spbpers_sex Gender,
spbpers_ethn_code Ethnicity,
sgbstdn_coll_code College,
sgbstdn_levl_code Level,
sgbstdn_camp_code Campus,
rcrapp4_sar_efc EFC,
rorstat_pckg_comp_date PackgCompDate,
rcrapp2_c_depend_status DependencyStatus,
sum(rprawrd_offer_amt) Offered,
sum(rprawrd_accept_amt) Accepted,
sum(rprawrd_paid_amt) Paid,
rorstat_unmet_need UnmetNeed,
shrlgpa_gpa GPA

from rcrapp1, spriden, spbpers, sgbstdn a
rcrapp4, rorstat, rcrapp2, shrlgpa,
rprawrd
where spriden_pidm = rcrapp1_pidm
and spbpers_pidm = spriden_pidm
and spriden_pidm = sgbstdn_pidm
and spriden_pidm = shrlgpa_pidm
and sgbstdn_levl_code = shrlgpa_levl_code
and spriden_pidm = rorstat_pidm
and rorstat_aidy_code = rcrapp1_aidy_code
and rcrapp1_pidm = rcrapp2_pidm
and rcrapp1_aidy_code = rcrapp2_aidy_code
and rcrapp1_infc_code = rcrapp2_infc_code
and rcrapp1_pidm = rcrapp4_pidm
and rcrapp1_aidy_code = rcrapp4_aidy_code
and rcrapp1_infc_code = rcrapp4_infc_code
and rorstat_pidm = rprawrd_pidm

and rorstat_aidy_code = rprawrd_aidy_code
and sgbstdn_term_code_eff = (select max(sgbstdn_term_code_eff)
from sgbstdn b
where a.sgbstdn_pidm = b.sgbstdn_pidm
and spriden_change_ind is null
and rcrapp1_aidy_code = :aidy
and rcrapp1_infc_code = 'ISIR'
and rcrapp1_curr_rec_ind = 'Y'
group by spriden_id, spriden_last_name, spriden_first_name,
rcrapp1_rcpt_date, rcrapp1_verification_msg, spbpers_sex,
spbpers_ethn_code, sgbstdn_coll_code, sgbstdn_levl_code,
sgbstdn_camp_code, rcrapp4_sar_efc, rorstat_pckg_comp_date,
rorstat_unmet_need, shrlgpa_gpa;

Finally, the only thing left is to calculate the total credits for each student. This can be done in two different ways:

  • Join to SFRSTCR and sum(credits)

  • Write a SCALAR SUBQUERY to do the calculation


In cases where only one or two fields are required from a new table, I prefer using the second method (SCALAR SUBQUERY). It acts as an outer join (if required) and performs better in many cases.