Please ignore my earlier email. I accidently sent the email before it was completed!

We are seeing a very strange behavior from MonetDB from simple queries on a single table.

In monthly_tou_avg_price table, we have monthly average aggregate data that were inserted from a very large table (> 2 billion) with hourly data. We have created the aggregate table to speed up certain queries. Strangely, we are getting incorrect results from very simple queries on the aggregate table.

Q1 below returns 9,890 records with yyyy_mm column having empty value for the 2012-07 records. All of the 2012-07 records are missing the yyyy_mm value. We ran the SQL statement that inserted those records and it does not show any empty yyyy_mm values. If we check for empty string or null value in yyyy_mm column (Q2 and Q3) against the aggregate table, we do not get any hits. If we query for yyyy_mm= '2012-07' as in Q4, we get exactly 9890 records showing yyyy_mm populated with value 2012-07. How could this be possible, Q1 showing empty yyyy_mm values for the same 9,890 records and Q4 showing 2012-07 yyyy_mm values?

Q1: select * from monthly_tou_avg_price where header_id = 6    
Q2: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = ''
Q3: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm is null 
Q4: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '2012-07'

Q5 below returns empty value for yyyy_mm even though yyyy_mm column does not have any empty or null values. Q6 does not return any empty or null yyyy_mm values and contradicts Q5 result by returning 2012-07. Furthermore, Q7 contradicts Q5 with 9,890 records with 2012-07 value in yyyy_mm column.
 
Q5: select distinct yyyy_mm from monthly_tou_avg_price where header_id = 6  
Q6: select distinct yyyy_mm from monthly_tou_avg_price 
Q7: select * from monthly_tou_avg_price where header_id = 6 and years = 2012 and months = 7

We also ran an UPDATE statement to set yyyy_mm to 2012-07 for rows with header_id = 6 and years = 2012 and months = 7. And, 9,890 records were successfully updated. However, Q1 through Q8 returned same results as above with no improvement.

We have not seen this kind of issues in other tables. This problem seems to be limited to a table that is populated with an INSERT INTO statement. We checked the result returned by the select portion of Q8 and yyyy_mm is populated with 2012-07. But, the aggregate table seems to end up in a weird state where it returns incorrect results. 

Q8: insert into "monthly_tou_avg_price" 
      select a.header_id, extract(year from a.opr_date) as y, extract(month from a.opr_date) as m, 
               b.yyyymm as yyyy_mm, b.svalue as TOU, a.resource_name,
               round(avg(a.lmp_cong_prc),5), round(avg(a.lmp_ene_prc),5), round(avg(a.lmp_loss_prc),5), round(avg(a.lmp_prc),5), round(avg(a.ns_clr_prc),5), 
               round(avg(a.rd_clr_prc),5), round(avg(a.ru_clr_prc),5), round(avg(a.sp_clr_prc),5), round(avg(a.shadow_prc),5),
               round(avg(a.fuel_prc),5), count(lmp_cong_prc) as actual_lmpcong_tou_count
       from oasislive.price a
        join oasislive.time b
         on a.yyyymmddhh = b.yyyymmddhh
    where a.opr_date between '2012-07-01' and '2012-07-31'
 group by a.header_id, y, m, yyyy_mm, TOU, a.resource_name
 order by y, m, a.header_id, TOU, a.resource_name;

We can see this INSERT INTO related issue with other queries. Q9 shows that we do not have any values for 2011-03 ON period which is incorrect. In fact, if we specify '2011-03' as in Q10, it shows that we do in fact have values for the 2011-03 ON period!

Q9: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price group by yyyy_mm, tou order by yyyy_mm, tou

yyyy_mm tou L6
.
.

2011-03 OFF 16.721532797520894
2011-03 ON (null)
2011-04 OFF 15.427080074671187
.
.

Q10: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price where yyyy_mm = '2011-03' group by yyyy_mm, tou order by yyyy_mm, tou

yyyy_mm tou L11
2011-03 OFF 16.72153279752101
2011-03 ON 30.276486575893575


I apologize for the long-winded email. I just wasn't sure how to describe what I was seeing.... Please let me know if there is any additional information that I can provide you with. By the way, we are currently using August 2011 -SP2 release, but saw this issue in older releases as well.

Best regards,

Henry




We are currently running MonetDB August 2011 SP2 release on CentOS