Hello, I am using Forena to generate some large and complex cross-tab tables. Mostly Forena is working very well for my purposes and I'm really pleased with it. However some of my queries are extremely long (3000 lines of code) and that is causing some performance issues, as the Forena engine takes a long time to parse those SQL files. Most of the performance issues are not deal-breakers, more just a nuisance. There are one or two things that are more serious and I will file a separate issue report for that. In this thread I am just hoping to find out if there's some better way I could be using Forena, that would help me avoid the crazy-long queries. Perhaps I have missed something obvious and someone can set me straight on this.

In my cross-tab tables, each table cell contains a number generated by a query. To get a whole big data block, I am chaining these queries together using UNION ALL. So for one data block I might have around 40 queries chained together with UNION ALL. To make matters worse, there is a lot of code repetition, so I have a lot of nested logic for including different SQL files for repeated chunks of code. I'm using SWITCH statements to include different snippets based on parameters. I'm not very familiar with SQL prepared statements and SQL variables. Perhaps I could have used those to make my SQL code more efficient. But I ruled out that approach since each Forena data block is restricted to one statement.

I hope my question is not too vague or overly general. I'm just trying to make sure I'm on the right track here and not making a stupid mistake in my strategy with Forena. Thanks a lot for any comments or advice you might have.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

lucyp created an issue. See original summary.

metzlerd’s picture

I would be surprised if you were hung up on the parsing of code or the size of the SQL files, but I suppose it is possible.

Have you done any XHPROF profiling to verify your theory? One way to test would be to turn debugging on in the module and copy one of these huge queries (as it was built) and verify that the query turn around time was small in relationiship to the SQL query generation time.

It sounds like you're building crosstabs with SQL. Have you thought about using the crosstab query renderer (FrxCrosstab) to perform the crosstabs and rewrite your query into an aggregation qery? That would be the only thing I could think of that qualifies as a different way to use forena to achieve better performance.

In my experience query performance has been more of an issue than parsing performance in most cases. Can you tell me how many rows of data these queries typically return?

lucyp’s picture

Hi, thanks for the response.

Yes, I have taken the SQL output from debug mode and run that query and it's fast. I had previously written some bad, slow queries, so I already went through a whole process of fixing those, optimizing, etc. So now the query execution time is fast enough.

The parse time affects performance in various ways, which I will describe here. At least I assume this is the time taken for Forena to parse, but tell me if I'm off-base.

1. When I load the Forena Reports data tab, it takes a long time to load. This started happening after I introduced the super-long queries. At one point I had to increase my PHP max_execution_time just to get this page to load. It used to timeout before displaying the page, and I could see from the stack trace that it was dying in the middle of a Forena parsing function.

2. When I go to my report containing the longest query, the report page takes a long time to load, before I even run the query.

3. When I edit report parameter options via the UI, I run into a problem when I start to fill in the field for Source Data. The auto-complete goes crazy. It spends a very long time trying to autocomplete that field, but unsuccessfully. After I type in the name of the data source and then save the parameter options, I get an ajax error because autocomplete is still running. Even after that, some process continues to run in the background, taking a lot of CPU cycles. It's a problem and I will file a separate issue on this in case anyone would like to fix it. Meanwhile I'm configuring my parameters by editing the frx files, which is fine.

All these problems only started happening after I introduced the huge queries that I mentioned, so that's why I assume it's my huge queries causing the problems. I don't really understand it, since I'm not sure why the Forena engine would actually need to be parsing the queries at these times, especially for case number 3 above.

I'm not exactly sure what you mean by aggregation queries, but I am indeed using the Forena crosstab renderer. And in my queries I make extensive use of COUNT and GROUP BY. After I chain all the queries together into a mega-query using UNION ALL, the resulting mega-query gives something like 100 rows of results, which are displayed by the crosstab renderer as a beautiful table.

So, just to be clear, it's not the query execution time that's a problem in my case. Anyway if I do have a query taking a long time to execute, that's very easy to explain to the client, that we are running a very complex report and they will just have to wait a minute for the results. Harder to explain is when they have to wait a long time just for the report page to load, before they can even select their parameters. As a work-around, I might build a separate page with a form just for the client to enter the parameters and feed them to the report.

Thanks!!

metzlerd’s picture

Do you have xhprof installed to do some performance profiling?

Verify that you have at least one of the report parameters listed as required. Sometimes what happens is the query performs fast, but if you don't have a required parameter, the query with no paramters takes a really long time to return rows. IF there are no required parameters, it will still run the datablock before the user has specified any. Again turning debug on for the data source will help you identify if this is the case.

If that doesn't help, then we will need XHPROF output to help me understand what I can do for performance inprovements in your case.

lucyp’s picture

The report has several required parameters, so I guess the report is not running, just taking a long time to load up.

I do not have xhprof installed, will do that and get back to you, thanks.

lucyp’s picture

Hi, I have installed xhprof, can view the table of XHProf output, though not yet the callgraph. Let me know if you need the callgraph too and I will work on configuring for that that. Anyway what can I send you from XHProf output? Screenshot? Or answer some specific question? Thanks.

lucyp’s picture

FileSize
1.9 MB

Ok, I have the callgraph available too now, see attached image. This is from the report I mentioned that uses a particularly long query. Let me know what other info I can give you from the xhprof output.

metzlerd’s picture

According to this graph, we're looking at 4500 calls to parseSQL. Are there nested data blocks involved here, that seems to be high. I'm going to suggest we take this offline for a bit. I'll send you a message on your contact form so that we can try direct exchanges.

metzlerd’s picture

Version: 7.x-4.5 » 7.x-4.x-dev
Status: Active » Needs review
FileSize
522 bytes

This may not totally fix your problem, but from xhprof output I noticed that the info file parser was being called on empty info sections, so this should eliminate that. Try this patch out and if it doesn't solve your problem, lets consider another xhprof test.

metzlerd’s picture

FileSize
1.16 KB

Actually here is a revision that should not parse SQL files that do not pass the skip test. That would be worth checking out as well.

lucyp’s picture

Wow!! The first patch helped somewhat, reduced the time for page load of report (before execution of report) by about 30%. But the second patch resulted in an almost instantaneous page load.

So, I don't really understand what you did to fix this, but it's amazing!

Thank you so much!

metzlerd’s picture

Title: I have huge SQL files -- am I using Forena Reports correctly? » Large numbers of Includes causes slow performance

  • metzlerd authored f4844e0 on 7.x-4.x
    Issue #2573351 by metzlerd: Large numbers of Includes causes slow...

  • metzlerd authored f4844e0 on
    Issue #2573351 by metzlerd: Large numbers of Includes causes slow...
metzlerd’s picture

You are quite welcome.

The problem was that the SQL files were being parsed even when the switch statements didn't evaluate the need. I basically made sure that the include files weren't parsed or loaded if we had figured out that they weren't needed.

It was easy once I had the xhprof data and sample queries in hand, so thanks for that.

metzlerd’s picture

Status: Needs review » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.