Showing posts with label extract. Show all posts
Showing posts with label extract. Show all posts

Tuesday, March 27, 2012

Determine sequence in log

In an AUDIT_LOG table, with approximately 1.6 million rows, would there be
any way to extract a pattern of rows? Specifically, I am trying to select al
l
records that have an audit_log_id of '13' which would be immediately followe
d
by a record with an audit_log_id of '16', which in turn would be immediately
followed by a record with an audit_log_id of '2', and display the results in
these groups of three records?
Thanks for any help.
SteveCan you post some ddl, sample data and expected result?
Please provide DDL and sample data.
http://www.aspfaq.com/etiquette.asp?id=5006
AMB
"Steve B" wrote:

> In an AUDIT_LOG table, with approximately 1.6 million rows, would there be
> any way to extract a pattern of rows? Specifically, I am trying to select
all
> records that have an audit_log_id of '13' which would be immediately follo
wed
> by a record with an audit_log_id of '16', which in turn would be immediate
ly
> followed by a record with an audit_log_id of '2', and display the results
in
> these groups of three records?
> Thanks for any help.
> Steve|||Only if there is some field, IN THE TABLE, that allows you to "determine" ro
w
sequence. I mean if there exists a ccolun, with unique values, which when
sorted, will sequence the rows in the order in which "immediately following
"
has the meaning you want it to have. If that's so, let's say that column i
s
named <LogDate>.
You have to join the table to itself, where, for each record in the first
instance of the table is "Joined" to it's Immediate Follower, by "Joining"
based on the value of LogDate being equal to the Minimum value of all the
records with LogDate > Than this records LogDate...
Select Prev.*, Next.*, Last.*
From AUDIT_LOG First
Join AUDIT_LOG Mid On
Mid.LogDate = (Select Min(LogDate) From AUDIT_LOG
Where LogDate > First.LogDate)
Join AUDIT_LOG Last On
Last .LogDate = (Select Min(LogDate) From AUDIT_LOG
Where LogDate > Mid.LogDate)
Where First.audit_log_id = 13
And Mid.audit_log_id = 16
And Last.audit_log_id = 2
"Steve B" wrote:

> In an AUDIT_LOG table, with approximately 1.6 million rows, would there be
> any way to extract a pattern of rows? Specifically, I am trying to select
all
> records that have an audit_log_id of '13' which would be immediately follo
wed
> by a record with an audit_log_id of '16', which in turn would be immediate
ly
> followed by a record with an audit_log_id of '2', and display the results
in
> these groups of three records?
> Thanks for any help.
> Steve|||Thanks Alejandro.
Below is an example. I would like to return the sets of rows that are 13,
16, 2. THere are three examples in this table.
AL_EVENT_ID AL_DATETIME
2 9/24/04 9:16 AM
13 9/24/04 9:51 AM
2 9/24/04 10:21 AM
13 9/24/04 10:39 AM
13 9/24/04 1:17 PM
2 9/24/04 1:23 PM
13 9/24/04 2:05 PM
2 9/24/04 2:40 PM
2 9/24/04 2:43 PM
13 9/24/04 2:56 PM
2 9/29/04 8:13 AM
2 9/29/04 4:11 PM
2 9/30/04 8:32 AM
16 10/1/04 11:38 AM
13 10/1/04 12:28 PM
16 10/1/04 12:29 PM
2 10/1/04 12:29 PM
2 10/1/04 2:30 PM
2 10/1/04 2:46 PM
2 10/1/04 3:04 PM
13 10/1/04 4:40 PM
2 10/1/04 4:44 PM
13 10/1/04 4:58 PM
2 10/4/04 9:42 AM
13 10/4/04 3:23 PM
16 10/4/04 3:23 PM
2 10/4/04 3:23 PM
2 10/5/04 4:56 PM
2 10/6/04 9:35 AM
13 10/6/04 11:52 AM
13 10/6/04 12:47 PM
16 10/6/04 12:49 PM
2 10/6/04 12:49 PM
2 10/6/04 1:32 PM
13 10/6/04 4:06 PM
2 10/7/04 8:51 AM
2 10/7/04 11:39 AM
13 10/7/04 11:49 AM
"Alejandro Mesa" wrote:
> Can you post some ddl, sample data and expected result?
> Please provide DDL and sample data.
> http://www.aspfaq.com/etiquette.asp?id=5006
>
> AMB
>
> "Steve B" wrote:
>|||Thanks very much. I'm sure this is what I need. Not quite clear though on th
e
Prev.*, Next.*, and Last.*. It returns Msg 107 - column prefix doesn't match
.
I have added some example data to Alejandro's post.
Steve
"CBretana" wrote:
> Only if there is some field, IN THE TABLE, that allows you to "determine"
row
> sequence. I mean if there exists a ccolun, with unique values, which when
> sorted, will sequence the rows in the order in which "immediately followi
ng"
> has the meaning you want it to have. If that's so, let's say that column
is
> named <LogDate>.
> You have to join the table to itself, where, for each record in the first
> instance of the table is "Joined" to it's Immediate Follower, by "Joining"
> based on the value of LogDate being equal to the Minimum value of all the
> records with LogDate > Than this records LogDate...
> Select Prev.*, Next.*, Last.*
> From AUDIT_LOG First
> Join AUDIT_LOG Mid On
> Mid.LogDate = (Select Min(LogDate) From AUDIT_LOG
> Where LogDate > First.LogDate)
> Join AUDIT_LOG Last On
> Last .LogDate = (Select Min(LogDate) From AUDIT_LOG
> Where LogDate > Mid.LogDate)
> Where First.audit_log_id = 13
> And Mid.audit_log_id = 16
> And Last.audit_log_id = 2
>
>
> "Steve B" wrote:
>|||Sorry they were table aliases in my first incantation, and neglected to
change them... They should be First, Mid, and Last
The SQL should be
Select First.*, Mid.*, Last.*
From AUDIT_LOG First
Join AUDIT_LOG Mid On
Mid.LogDate = (Select Min(LogDate) From AUDIT_LOG
Where LogDate > First.LogDate)
Join AUDIT_LOG Last On
Last .LogDate = (Select Min(LogDate) From AUDIT_LOG
Where LogDate > Mid.LogDate)
Where First.audit_log_id = 13
And Mid.audit_log_id = 16
And Last.audit_log_id = 2
"Steve B" wrote:
> Thanks very much. I'm sure this is what I need. Not quite clear though on
the
> Prev.*, Next.*, and Last.*. It returns Msg 107 - column prefix doesn't mat
ch.
> I have added some example data to Alejandro's post.
> Steve
> "CBretana" wrote:
>|||Using your actual column name,
Select First.*, Mid.*, Last.*
From AUDIT_LOG First
Join AUDIT_LOG Mid On
Mid.AL_DATETIME =
(Select Min(AL_DATETIME) From AUDIT_LOG
Where AL_DATETIME > First.AL_DATETIME)
Join AUDIT_LOG Last On
Last.AL_DATETIME =
(Select Min(AL_DATETIME) From AUDIT_LOG
Where AL_DATETIME > Mid.AL_DATETIME)
Where First.audit_log_id = 13
And Mid.audit_log_id = 16
And Last.audit_log_id = 2
"Steve B" wrote:
> Thanks very much. I'm sure this is what I need. Not quite clear though on
the
> Prev.*, Next.*, and Last.*. It returns Msg 107 - column prefix doesn't mat
ch.
> I have added some example data to Alejandro's post.
> Steve
> "CBretana" wrote:
>|||Thank once again for your help. It works like a charm
Steve
"CBretana" wrote:
> Using your actual column name,
> Select First.*, Mid.*, Last.*
> From AUDIT_LOG First
> Join AUDIT_LOG Mid On
> Mid.AL_DATETIME =
> (Select Min(AL_DATETIME) From AUDIT_LOG
> Where AL_DATETIME > First.AL_DATETIME)
> Join AUDIT_LOG Last On
> Last.AL_DATETIME =
> (Select Min(AL_DATETIME) From AUDIT_LOG
> Where AL_DATETIME > Mid.AL_DATETIME)
> Where First.audit_log_id = 13
> And Mid.audit_log_id = 16
> And Last.audit_log_id = 2
> "Steve B" wrote:
>|||Yr very welcome!
"Steve B" wrote:
> Thank once again for your help. It works like a charm
> Steve
> "CBretana" wrote:
>

Wednesday, March 7, 2012

Destination Spreadsheets in SSIS

Hello -

I am dealing with SSIS in VS 2005... Trying to convert all my DTS packages... So, basically all my packages will extract some information from a database and load the results into a spreadsheet.

To start I am trying to do a TOP 1 returning a string from the db... The first row has column names but mysteriously the package will start to write the expected results in the third row instead of the second one. The second row will remain blank and if I do a preview against the destination spreadsheet within the pkg I will see a NULL value in the second row and then in the third row I will see the string I was expecting.

Tried the following with no success:

Regedit.exe, in Hkey_Local_Machine/Software/Microsoft/Jet/4.0/Engines/Excel/ do TypeGuessRows=8, ImportMixedTypes=Text, AppendBlankRows=0, FirstRowHasNames=Yes

Any help would be really appreciated.

I noticed some people are looking at this topic but no answers up to now... Should I add more information so that maybe you can help?

Thank you.

|||

I just tried it and it works for me. Header in row 1 and data starts and ends in row 2.

Please post more info.

|||I have not seen that behavir either. Perhaps is something in the source table that is causing that. Have you tried deleting the excel file and the connection manager and creating them back?|||

I am still having issues... I’ve created a new package using table pubs.dbo.jobs. I added a "data flow task" to the "control flow" pane. After that, I went to "data flow" pane and added a "OLE DB source" which has the sql query "select top 1 job_id from pubs.dbo.jobs", returning 1.

I also added an "Excel Destination" which points to a spreadsheet with a column header called "ID". All cells in this xls are defined as numbers... After that, I tied both components in the "data flow" pane and ran the package (in the connections managers I have the OLE DB and the excel connections).

SSIS succeeded and loaded 1 into the destination spreadsheet, in the third row instead of the second one. When I go to the "Excel Destination", hit edit and do preview, I see ID in the first row (which is the column name), NULL in the second row (unexpected) and 1 in third row (results from sql query).

Please, advise. Thanks in advance.

|||I follow the same steps you described and everything looks right. So, How are you creating the excel file? I created a connection manager to an unexisting excel file; the in the Excel destination componnet I use the New button in the 'name of the excel sheet' to create it as suggested by SSIS. Perhaps you are pointing to an existing file that has something in it it that causes the behaivor you see.|||

Rafael -

This does solve the problem. I really use an existing spreadsheet so I believe something wrong was happening while trying to match data types (sql query X existing destination sheet).

However, I really need to use this existing spreadsheet... This is because when my package is executed, the destination file is overwritten by a copy of the empty template file with column headers only. So the template file should be used... This is the way I always did for DTS packages and I'm wondering if you have another way to append the destination file in the second row always (otherwise file will keep growing).

BTW, these conversion data type "issues" in SSIS are a pain... I just don't understand why Microsoft eliminated implicit conversions (we have to convert explicitly in the sql query or either add a Data Conversion transform task mostly when we deal with strings - more to be done in addition to all that we have on our plate already). I miss DTS packages...

Please, reply with any comment on how to deal with the problem I have using an existing excel sheet. When I create a new one as suggested by SSIS, it does solve the NULL issue but I need to use a template file to assure I am starting in the second row always.

Thanks for your help.

|||

Gabriel Souza wrote:

BTW, these conversion data type "issues" in SSIS are a pain... I just don't understand why Microsoft eliminated implicit conversions (we have to convert explicitly in the sql query or either add a Data Conversion transform task mostly when we deal with strings - more to be done in addition to all that we have on our plate already). I miss DTS packages...

They may create a little more work upfront, but I'll readily trade that for not having to deal with implicit conversions. I've seen too many tools that use implicit conversions guess wrong about the datatypes, and you get no errors, no warnings. I've seen input files get columns re-ordered, but due to implicit conversions, the import process didn't throw any errors and the wrong data was imported for two weeks before a user questioned the values they were getting.

An explicit conversion is a lot safer for a production quality application.

|||

If you use an existing excel, the Excel destination will APPEND to the file. So if it detects that the original excel has 3 rows, it will write from the 4th row.

Even if your rows 2,3 look empty to you in the excel, they may have been used previously, and may have formatting/comments etc.Excel will not see these rows as empty.

One good way to check is to open the excel, and press CTRL_Home and then CTRL_Shift_End. If rows 2 and three are being marked, then that means excel detected three rows.

To solve this problem, mark A2 to Ctrl_Shift_End and delete cells (move cells up), then save the excel. Then check again to see if row 2 and 3 are being marked.

I had this problem, and it got solved when I edited the excel properly.

|||

Karfast -

Thanks for the reply but this does not help... The template file is in the correct format and we can't have manual work here...

The packages are supposed to run daily and they can't keep appending because they will grow indefinitely. This is why we use the template file and copy it to overwrite the destination file. The file will then be empty and the package will start in the second row again (at least this is the way we always did for dts's).

The problem I have is.... When this template file is copied the NULL issue happens again and it looks like SSIS gets lost with the data types because I did not create the excel sheet by clicking the "new" button in the "excel destination editor" as Rafael suggested (since I am dealing with the new template file by the time package runs).

Rafael suggestion does solve the problem with the unexpected NULL (because the destination sheet is created as SSIS suggests when we click the "new" button) but when the template file is copied the issue is raised again... This process of appending the file in the second row always should be automatic and we can't rely on manual deletions. As I said, the template file is formatted correctly.

Let me know your thoughts.

Thanks a lot.

|||

Gabriel on a recent project we ran into a nearly identical issue. At the end of updating one of our dimensions we needed to output the results to reflect any changes so our Excel and database table were completely in sync. We solved the issue using a template Excel file, but the template followed these standards:

All cells formatted as text

No forumulas

Header row only

All rows below header deleted initially to avoid any previous entries

Once the template was created it was untouched. Within our SSIS package we used a File System Task to copy the template over the existing file then in our data flow we took our source then ran it through data conversion (unicode string 255 or double precision floats) and output it to our new excel file. We used an Excel destination to map to our overwritten file and for the file task we had a before (template) and after (destion).

I can send you the package and template if you are still having difficulty but the lessons we learned were the formatting of the Excel template and matching the data types from the database to the Excel destination.

Hope this helps! Good luck!

|||

Hi Gabriel,

I meant that the template file should be correct, and that is the only manual operation.

Now if the template has col headers in row 1, and data in its second and third rows, all destination files will also have it. And hence, your first real row will be in the 4th row.

Please check your template file.

I had exactly the same scenario, and same problem, and it got solved.

HTH

Kar

|||

ADMariner -

Your suggestions are pretty good and I see you truly understood the issue I was having. I tried to apply these steps but unfortunately I was still having the issue... Even making sure all the 60,000 rows after the first row were being deleted properly, the NULL value would still show up.

Only thing that solved the problem was to create the "destination" sheet by clicking the "new button" in the "excel destination editor" as Rafael suggested. As soon as I applied this same trick to the template file and pointed the SSIS package back to the destination spreadsheet, the issue went away.

I really appreciate all the help from everybody. Thanks a lot Smile

|||

ADMariner wrote:

We solved the issue using a template Excel file, but the template followed these standards:

All cells formatted as text

No forumulas

Header row only

All rows below header deleted initially to avoid any previous entries

Once the template was created it was untouched. Within our SSIS package we used a File System Task to copy the template over the existing file then in our data flow we took our source then ran it through data conversion (unicode string 255 or double precision floats) and output it to our new excel file. We used an Excel destination to map to our overwritten file and for the file task we had a before (template) and after (destion).

After you wrote to the template and then opened it in Excel, were your floats represented as numbers or text? I haven't been able to get it to write floats to an existing file (header row only) that remain formatted as numbers in Excel. Instead, they're text with the related side-effects (left justified, green triangle in upper left warning about a possible number in a text field, conditional formatting doesn't work correctly, etc.) I tried having the Excel destination create the spreadsheet, and then it DID write the floats as numbers, and would continue appending numbers on subsequent runs, but when I removed the data from that file (leaving the header) and tried to write to it again, the floats were then written as text.