1 million rows and SAINT still wants more

While this might be a quickie, it’s a biggy.  A big one in terms of the amount of data just uploaded through SAINT.  In fact, we’ve just uploaded around 1 million rows of data, with 6 columns per row.

And it didn’t even blink!  Gotta love that!

So why do we have a million rows of data?
Customer segmentation of course.

This was actually done for one of our other clients.

The rationale?

To segment conversions and transactions by customer type, segment, previous segment, needs group etc.  And SAINT enables that capability.

How?

Firstly, create an eVar that the raw identifier will go into.  This might be an account number, a customer ID etc.  Then, using the admin, create the classifications on the eVar for the relevant columns you need.  At this point I always create the classification hierarchy as well, just so I can envision how I want the data to be reported and drilled down through.

When you create the classifications, the SAINT file is also created and made available for download.

I opened the SAINT template in Excel and copied my customer segment data into it in blocks of 100,000 records.  There are a number of reasons for this, not the least of which is to keep the file size down, but also to make it easier if an upload does fail: 100,000 rows are much easier to deal with than 1 million.

So I’ve now got 10 files, each containing 100,000 rows with 6 columns of data per row.  Each file was about 5 MB.
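If the copy and paste in Excel gets tedious, the same split can be scripted. The following is only a rough sketch, not what was actually done here: it assumes the full data set already sits in one tab-delimited file, and the function name, the numbered output file names and the `header_lines` parameter (how many template header lines to repeat at the top of every chunk) are all made up for illustration. Check your own SAINT template to see how many header lines it really has.

```python
from itertools import islice
from pathlib import Path


def split_saint_file(source, out_dir, rows_per_file=100_000, header_lines=1):
    """Split one tab-delimited file into numbered .tab chunks.

    The first `header_lines` lines of the source are copied to the top of
    every chunk. Returns the list of chunk paths that were written.
    """
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # newline="" keeps the original line endings untouched
    with source.open(newline="", encoding="utf-8") as src:
        header = list(islice(src, header_lines))
        part = 1
        while True:
            rows = list(islice(src, rows_per_file))
            if not rows:
                break
            chunk = out_dir / f"{source.stem}_{part:02d}.tab"
            with chunk.open("w", newline="", encoding="utf-8") as dst:
                dst.writelines(header)
                dst.writelines(rows)
            written.append(chunk)
            part += 1
    return written
```

With 1 million rows and the default of 100,000 rows per file, this would produce 10 chunks, each carrying its own copy of the header.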

You can’t upload that much data through the browser, so you need to use the FTP Import capability.

In the SAINT admin, select Import File, click on FTP Import and then Add New:

[Screenshot: ftp_import]

You’ll then get a popup that asks you to select a bunch of things to create an FTP account:

[Screenshot: ftp_import_selection]

Select the data to be classified, move the report suite or suites to the box on the right, select the import options and add in your email address.

Check the box and hit save.

A new FTP account has just been created on the Omniture servers and you’ll get a confirmation screen showing the address, username and password.

Open it up in an FTP client and upload your SAINT files to the FTP server.

You’re not quite done yet though.

You also need to create a series of empty files, with a .fin extension, named exactly the same as your SAINT files.  These are “finish” files and are crucial to the upload.  They’re completely empty files, and any text editor can create them.  Just make sure that, apart from the extension, each name matches its SAINT file exactly, including upper and lower case.

Upload those .fin files and you’re done.
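Creating ten empty files by hand is easy enough, but it can also be scripted. Here is a minimal sketch, assuming the SAINT files sit in one local folder and use a .tab extension; the function name and the folder layout are only illustrative.

```python
from pathlib import Path


def create_fin_files(directory, data_suffix=".tab"):
    """Create an empty .fin file next to every data file in `directory`.

    The .fin file keeps the data file's exact base name (case included).
    Returns the list of .fin paths.
    """
    directory = Path(directory)
    fin_files = []
    for data_file in sorted(directory.glob(f"*{data_suffix}")):
        fin_file = data_file.with_suffix(".fin")
        fin_file.write_bytes(b"")  # zero-byte file, truncates if it exists
        fin_files.append(fin_file)
    return fin_files
```

Because the name is derived from the data file itself, there is no chance of a typo or a case mismatch between the two.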

Now, go have a coffee, have some lunch or dinner or whatever and come back later.

Progress

You can get a rough idea of progress by refreshing the FTP list of files.  Omniture removes the files from the FTP directory when it begins to process them, so anything still listed has not been picked up yet.
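The same check can be done from a script instead of an FTP client. This is a sketch only: `pending_files` is a made-up helper, and it assumes you pass in an already connected `ftplib.FTP` object (or anything else with an `nlst()` method) pointing at the upload directory. Be aware that some FTP servers answer `nlst()` on an empty directory with an error rather than an empty list.

```python
def pending_files(ftp, uploaded_names):
    """Return the uploaded file names that are still listed on the server.

    `ftp` is a connected ftplib.FTP instance; `uploaded_names` is the list
    of file names you put there (both data and .fin files).
    """
    still_listed = set(ftp.nlst())
    return [name for name in uploaded_names if name in still_listed]
```

An empty result suggests everything has at least been picked up for processing; it does not by itself mean the import has finished.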

Time Frame

I uploaded the files around 4pm.

At 10:30pm I did a data extract by FTP of all data to see where it was up to…it was done.  Shortly thereafter, I got an email saying it was done, without any failures.

Easy as pie.  No muss no fuss.

While we’re using customer segments, it could just as easily have been customer demographics, technographics or any other form of data.  The point is, 1 million rows and it didn’t even blink.

There are a few things to watch out for, though, when importing that much data.

There is a limit on the number of unique values (500,000) that will be reported against in a given month.  We’re OK here, as we won’t hit that limit.

Recommendations are that file sizes be kept under 30 MB for the initial load, and that subsequent refreshes be kept under 5 MB.  So we’re still OK.

And the import time will vary depending on many things, including how busy their import routines are.  You get in the queue and everyone loves a queue.
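A quick pre-flight check before uploading can catch files that are over the recommended size, and tell you how many unique keys you are about to send. The sketch below is hypothetical: both function names are mine, it treats 1 MB as 1024 × 1024 bytes, and it assumes tab-delimited files with the key in the first column and `header_lines` header lines at the top.

```python
from pathlib import Path


def oversized_files(paths, limit_mb=30):
    """Return the files whose size on disk exceeds `limit_mb` megabytes."""
    limit_bytes = limit_mb * 1024 * 1024
    return [Path(p) for p in paths if Path(p).stat().st_size > limit_bytes]


def count_unique_keys(paths, header_lines=1):
    """Count distinct first-column values across all files, skipping headers."""
    keys = set()
    for p in paths:
        with Path(p).open(encoding="utf-8") as fh:
            for line_number, line in enumerate(fh):
                if line_number < header_lines:
                    continue
                key = line.rstrip("\r\n").split("\t", 1)[0]
                if key:
                    keys.add(key)
    return len(keys)
```

For an initial load you would call `oversized_files(files)` with the default of 30, and `oversized_files(files, limit_mb=5)` for later refreshes.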

But that was it.  1 million rows of customer data now available for segmentation nirvana in SiteCatalyst – and DataWarehouse, and Discover, and Test and Target.  We’re off to the races!

And while the first run of this was a manual run, future updates can easily be automated now that the FTP site is created.  Just remember your .tab and .fin files must be named the same.
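As a rough idea of what that automation could look like, here is a sketch using Python's standard `ftplib`. It is an assumption on my part rather than anything Omniture supplies: `upload_with_fin` is a made-up helper, and the host, username and password in the usage comment are placeholders for the details shown on your FTP confirmation screen. It uploads every data file first and only then sends an empty .fin file for each one, named to match.

```python
import io
from pathlib import Path


def upload_with_fin(ftp, data_files):
    """Upload each data file, then an empty .fin file with the same base name.

    `ftp` is a connected and logged-in ftplib.FTP instance.
    Returns the remote file names in the order they were sent.
    """
    data_files = [Path(p) for p in data_files]
    uploaded = []
    for path in data_files:
        with path.open("rb") as fh:
            ftp.storbinary(f"STOR {path.name}", fh)
        uploaded.append(path.name)
    # .fin files go last, once all the data files are in place
    for path in data_files:
        fin_name = path.with_suffix(".fin").name
        ftp.storbinary(f"STOR {fin_name}", io.BytesIO(b""))
        uploaded.append(fin_name)
    return uploaded


# Usage (placeholder host and credentials):
#   from ftplib import FTP
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("username", "password")
#       upload_with_fin(ftp, sorted(Path("saint_chunks").glob("*.tab")))
```

Taking the connection as a parameter keeps the credentials out of the function and makes it easy to run on a schedule.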

The content and advice contained in this post may be out of date. Last updated on July 6, 2011.
