Distribution of Dynamic Binary Data Using ASP
Abstract
In this paper we will examine one way to distribute dynamic
binary data via HTTP to clients using ASP (Active Server Pages).
The problem, specifically, is distributing a self-extracting
compressed archive with the ability to change the data inside
of the archive in real time on the server side before sending
it to the client. However, the method presented here can easily
be generalized to binary data of any file format. It requires
a few tricks, but is otherwise fairly straightforward.
Building the database
Before we can construct the ASP code to change and serve
the appropriate data we must have a source for the original
binary file. To facilitate this we will use a simple database.
The design of the database is simple. We will use a Microsoft
Access database with one table ('PacT') and two fields in
it: 'Version' (Text) and 'Pac' (OLE Object). (Here 'Pac' is
short for 'Package'.) You can obviously add more fields later
if your application calls for it.
Next we create a record in the table 'PacT' with an arbitrary
value for 'Version' and leave the 'Pac' field empty. This
is the point at which we encounter our first problem: how
do we get the binary data into the database? There is no
easy way. The best approach I found is to open the database
from a VB (Visual Basic) program and insert the data from
there. Attached is the source code as well as a compiled
executable [1] for the program created to do this. It is very
short and simple; however, it has not been tested or optimized
to a high degree, so it might need a little tweaking. The
main function of interest is 'Insert Data'. To confirm that
the data was stored correctly, use the 'Write Data To File'
function. As a final check, when the database is reopened
in MS Access, the 'Pac' field of the record of interest should
read 'Long binary data'. Once this point is reached the database
is ready to use.
Before we get to the next section, for those of you who are
interested in how I arrived at this method, read on. My first
approach was not using a database, but rather trying to load
up the data into a string and then passing that string on
to the client. However, as you might imagine, when you are dealing
with large files this quickly becomes an extremely difficult
and tedious task. I am also including the source code [2]
to a program that I first created to read a file and then
spit out code that would load that file up into a string,
byte by byte. Note that I am not including a compiled version
at all, since you will have to change some paths and things
yourself anyway. It is not nearly as refined as the final
program I used to load data into the database, but it actually
works (for the most part). If you are distributing a file
< 1KB you *might* want to consider using this method under
special circumstances, but I would strongly advise against
it. The main reason I'm including this source code is because
it is interesting; it turns out that it is not very useful.
Querying the database
This part is actually pretty simple. Use a simple query to
get a record set with the data of interest in it (only one
record for our example).
One very important thing I would like to mention here, however,
is that you *must* enable ASP processing for the file extension
of interest (here *.exe) on your IIS account. This created
a huge problem for me at first. I turned this on, wanting
to distribute one of my programs dynamically, but I assumed
since the other executables were straight binary (no ASP code
to be processed) that they would be passed right along as
usual. I was wrong. By simple probability, you can see how
a file of even a modest size could easily contain a '<%'
or '%>' in it somewhere. Once the ASP processor hit this
point it returned an error and nobody was able to retrieve
these other programs. Because of this you must distribute
all of the files of this type within your IIS account from
a database, even if you are not changing any of their content
on the fly.
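The "simple probability" argument can be made concrete with a rough estimate. The sketch below is only an approximation under a loud assumption: the file is treated as uniformly random bytes, each starting position is treated as an independent trial, and the two markers are treated as independent of each other. Real executables are not random, so the figures are indicative at best. The function name chance_of_marker is made up for this illustration.

```python
def chance_of_marker(size_bytes: int, marker_len: int = 2) -> float:
    """Approximate chance that one fixed marker appears in random data."""
    # Number of positions at which the marker could start.
    positions = max(size_bytes - marker_len + 1, 0)
    # Chance that random bytes match the marker at one given position.
    per_position = 256.0 ** -marker_len
    # Assumption: positions are treated as independent trials.
    return 1.0 - (1.0 - per_position) ** positions


for size in (1_024, 65_536, 1_048_576):
    single = chance_of_marker(size)
    # '<%' and '%>' are two different markers; assume independence.
    either = 1.0 - (1.0 - single) ** 2
    print(f"{size:>9} bytes: about {either:.1%} chance of '<%' or '%>'")
```

Under these assumptions a single two-byte marker already has roughly a 63% chance of appearing in a 64 KB file, which is why even modest executables are likely to trip the ASP processor.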
Depending on your hosting situation, it shouldn't be too
hard to enable ASP processing for unusual file types. Chances
are that you do not have in-house hosting, since if you do
then it might be easier just to call a local DLL to do it
all for you. Most competent hosts won't put up too much of
a fight against this odd, yet secure (as long as you know
what you're doing and your host does too), request.
Another important thing is to put the following two lines
at the top of your code.
Response.buffer = TRUE
Response.ContentType = "application/octet-stream"
The first one allows the download to complete faster by telling
the server to execute all ASP code before sending anything
to the client [3]. The second line instructs the server to
tell the client that the data you are sending is binary data,
so the browser will show the user the typical file download
prompt. Some example code might be:
<%
Response.Buffer = True
Response.ContentType = "application/octet-stream"
'Connection string for the Access database (adjust the path)
strConn = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
          "Data Source=D:\Inetpub\mydomain.com\db\mydb.mdb"
Set conn = Server.CreateObject("ADODB.Connection")
conn.Mode = 3 '3 = adModeReadWrite
conn.Open strConn
''Version' is a Text field, so its value must be quoted
SQLStmt = "SELECT Pac FROM PacT WHERE Version = '100'"
Set rs = conn.Execute(SQLStmt)
%>
Feeding Data
This is the good stuff that makes it all possible. If you
want to just pass the data in the database on to the client
without any modification, you would write:
<%
dim mFieldSize, mBytes
mFieldSize = rs.Fields("Pac").ActualSize
mBytes = rs.Fields("Pac").GetChunk(mFieldSize)
response.binarywrite mBytes
%>
This gives you the basis needed to send just about any kind
of data that you want. To demonstrate we will return to our
specific case of distributing a self-extracting executable.
This archive contains one text-only file ten characters/bytes
long which we will assign a random value for each download.
During installation, this text file can be copied to a certain
location and later read by the program or used in any other
way imaginable. We are now immediately faced with a few problems:
1) Compression: We do not want to have to compress our new
ten byte file every time we are making a distribution. This
would be cumbersome, probably prompting the need for a file
compression ActiveX control or the like. To get around this,
we store this one file uncompressed when creating the initial
archive. It is still possible, and advisable, to compress
every other file. For a ten byte file the increase in distribution
size is negligible.
2) CRC Values: If we simply change the contents of the ten
byte text file and send it off to the client as-is, then when
the user tries to unzip/open the file they will receive a CRC
error indicating that the file has been corrupted. A CRC (Cyclic
Redundancy Check) is a checksum computed from the data, designed
so that almost any modification to the data produces a different
value. It is a mechanism for detecting accidental corruption,
for example during a download. In PKZip, WinZip, and every
other compression program which follows the standard, a CRC
value is created for each file and stored along with the file
inside the archive. So when we change the value of the text
file we must compute a new CRC value for it and replace the
old CRC with the new one. After a little research and thought
this is no longer a large hurdle.
A quick trip to the PKZip application note [4] reveals that
the CRC value is stored in two places: a local file header
and the central directory (assuming no data descriptor is
included, which is generally the case). This is where a hex
editor comes in handy. I highly recommend WinHex, but just
about any one will serve our purpose here. I suggest you open
your archive, search it for the current CRC value of the static
file (WinZip can display a CRC column for each file if you
choose to view it), and get comfortable with the general layout.
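For readers who prefer a script to a hex editor, here is a minimal offline sketch in Python (not part of the ASP solution) that shows where the two CRC copies sit. It assumes a small archive with one stored member and no data descriptor; the helper names build_stored_zip and crc_field_offsets and the member name code.txt are invented for the illustration. The field positions used (14 bytes into the local file header, 16 bytes into the central directory record) follow the header layouts in the application note.

```python
import io
import struct
import zipfile
import zlib


def build_stored_zip(name, payload):
    """Create an in-memory archive with one uncompressed (stored) member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


def crc_field_offsets(archive):
    """Return byte offsets of the CRC-32 field in the first local file
    header and in the first central directory record."""
    # Naive signature search; fine for this tiny example archive.
    local = archive.find(b"PK\x03\x04")    # local file header signature
    central = archive.find(b"PK\x01\x02")  # central directory signature
    if local < 0 or central < 0:
        raise ValueError("not a zip archive")
    return local + 14, central + 16


archive = build_stored_zip("code.txt", b"ABCDE12345")
local_crc, central_crc = crc_field_offsets(archive)
# Both copies are stored as little-endian 32-bit values.
stored_local = struct.unpack_from("<I", archive, local_crc)[0]
stored_central = struct.unpack_from("<I", archive, central_crc)[0]
print(local_crc, central_crc)
print(hex(stored_local), hex(stored_central), hex(zlib.crc32(b"ABCDE12345")))
```

All three printed CRC values should agree, which confirms that both fields hold the CRC of the stored file's contents.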
I was able to find some pretty good ASP source code for a
CRC calculator off of Planet Source Code. It was a while ago,
so I'm not sure which entry it was from, and thus am unable
to give proper credit for it. If you are the author please
send me an email and I will give you credit here. At any rate,
I am including the CRC source code [5] as well. Just paste
it into the bottom of your ASP page and go from there. Here
is an example of some source code which generates a random
value consisting of the digits zero to nine and capital letters
for our ten-character text file (the function is defined in
the same file as the CRC functions), inserts this new value
in place of the text file's contents, generates a new CRC
value, and inserts it at the right points.
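<%
dim mFieldSize, mBytes
mFieldSize = rs.Fields("Pac").ActualSize
mBytes = rs.Fields("Pac").GetChunk(mFieldSize)
Dim myrnd
myrnd = GenRnd 'new ten-character value for this download
Private Crc32Table(255) 'global array needed by CRC functions
'Everything before the CRC in the local file header
response.binarywrite mid(mBytes, 1, 321862)
'New CRC for the local file header
response.binarywrite GenMyCRC(myrnd)
'Remainder of the local file header, up to the file data
response.binarywrite mid(mBytes, 321865, 10)
'New contents of the text file, one byte per character
for i = 1 to 10
response.binarywrite chrb(asc(mid(myrnd, i, 1)))
next
'Data between the text file and the CRC in the central directory
response.binarywrite mid(mBytes, 321880, 4013)
'New CRC for the central directory
response.binarywrite GenMyCRC(myrnd)
'Remainder of the archive
response.binarywrite mid(mBytes, 325895)
%>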
The most complicated things in this source are the numbers
that restrict the range of data to send (i.e. 321862, 321865,
321880, and 325895). Below are the representative sections
which they send.

Figure 1. Block diagram of file layout. Note that all stated
positions are the offsets for the character before the break.
You find these positions for your particular archive by doing
a search in your hex editor for things like the current CRC
(once again given to you in WinZip) and the original file
contents. However, you will have to take care when searching
for the CRC. WinZip displays it with the MSB (Most Significant
Byte) first, whereas the file stores it with the LSB (Least
Significant Byte) first, so the bytes appear in reverse order
in the hex editor. Also, when you find the offset for these
values, you will probably have to divide it by two, since the
hex editor reports a byte offset while the positions used in
the code count characters, and it takes two bytes to make one
character. You should get an integer back, since each character
starts at an even byte offset. For example, the original
offset given to me by WinHex for the start of the first CRC
was 9D28C. I punched this into the Windows Calculator in hex
mode, divided by two, and then switched to decimal, which gives
321862. It sounds rather complicated at first, but once you
open the file in your hex editor it quickly becomes obvious
what is going on.
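The same arithmetic can be checked with a few lines of Python instead of the Windows Calculator. This is only a sketch: it assumes, as the numbers above suggest, that the positions used in the ASP code count two-byte characters, and the CRC value 1A2B3C4D is a made-up example, not the CRC of any real file.

```python
import struct

# Byte offset of the first CRC as reported by the hex editor.
hex_editor_offset = 0x9D28C
# Assumption: two bytes per character, so halve the byte offset.
char_position = hex_editor_offset // 2
print(hex_editor_offset, char_position)  # prints: 643724 321862


def crc_as_stored(crc_text):
    """Convert a CRC as displayed (most significant byte first) into the
    byte order used inside the archive (least significant byte first)."""
    return struct.pack("<I", int(crc_text, 16))


# Hypothetical displayed CRC; search the hex editor for the reversed bytes.
print(crc_as_stored("1A2B3C4D").hex().upper())  # prints: 4D3C2B1A
```

The second result is the byte sequence to search for in the hex editor when the archive tool displays the CRC as 1A2B3C4D.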
3) Self-extracting executable wrapping: It turns out that
wrapping a zip file into an executable file does not change
the structure of the zip file after all, so we do not need
to worry about any changes that might take place during the
self-extractor creation. Just make sure that when you get
the offsets for the different points of the archive that you
are getting them from the self-extracting version. Of course,
if you do not wish to use a self-extracting archive then it
is simply one less thing to worry about.
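The first three problems can be tried out together offline before committing any offsets to ASP code. The Python sketch below is an illustration under stated assumptions, not the method used on the server: the 64-byte stub is a placeholder rather than a real extractor, the helper names stored_zip and patch_stored_member are invented, and the patcher finds its positions by searching for the old contents and old CRC, much as one would in a hex editor.

```python
import io
import struct
import zipfile
import zlib


def stored_zip(name, payload):
    """Create an in-memory archive with one uncompressed (stored) member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


def patch_stored_member(blob, old, new):
    """Swap a stored member's bytes and update both CRC-32 copies."""
    if len(old) != len(new):
        raise ValueError("replacement must be the same length")
    old_crc = struct.pack("<I", zlib.crc32(old))
    new_crc = struct.pack("<I", zlib.crc32(new))
    # Refuse to guess if the search results are ambiguous.
    if blob.count(old) != 1 or blob.count(old_crc) != 2:
        raise ValueError("ambiguous match; locate the offsets by hand")
    data_at = blob.index(old)
    first = blob.index(old_crc)               # local file header copy
    second = blob.index(old_crc, first + 4)   # central directory copy
    out = bytearray(blob)
    out[data_at:data_at + len(new)] = new
    out[first:first + 4] = new_crc
    out[second:second + 4] = new_crc
    return bytes(out)


stub = b"MZ" + bytes(62)  # placeholder for the extractor code
sfx = stub + stored_zip("code.txt", b"ABCDE12345")
patched = patch_stored_member(sfx, b"ABCDE12345", b"Z9Y8X7W6V5")
with zipfile.ZipFile(io.BytesIO(patched)) as zf:
    # testzip() returns None when every member passes its CRC check.
    print(zf.testzip(), zf.read("code.txt"))  # prints: None b'Z9Y8X7W6V5'
```

Changing the ten bytes without touching the two CRC fields makes the same check fail, which is the CRC error described in problem 2.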
4) Yet another consideration arises if you want to pass
arguments in the URL to the script generating the executable.
For example, if you run an affiliate program you can tell each
of your affiliates to send the people that they refer to
'myapp.exe?affiliate=2485'. This comes in extremely handy,
but one problem that arises is that when the client starts
the download the default filename will be literally
'myapp.exe?affiliate=2485'. Luckily the HTTP people have
given us an easy fix. Somewhere in your code insert the
following line (adapted for your filename, of course):
Response.AddHeader "Content-Disposition", _
"attachment; filename=myapp.exe"
...bingo!
Applications
There are tons of things you can do with this construct.
You're not limited to executables, zip archives, or any file
format. Here are a few ideas.
Note: Many of the things you can do with this would be an
extreme invasion of the end user's right to privacy! Make
sure that anything implementing a system like this does not
infringe on these rights!
1) Affiliate Program Tracking: As mentioned above, you can
encode every downloaded file with a unique string and store
the affiliate ID associated with that string in a database.
For example, if you are distributing shareware programs and
the user must connect to your server to retrieve their key,
you can have the client send its string to the server.
Alternatively, when a user clicks
in your application on the button to register, append the
ten digit code as an argument to the end of the web page they
are directed to and read it from there. This way you can link
every registration with the person who referred them. (This
is a lot more robust than using cookies!) Even if you do not
run an affiliate program, you could still track the referring
URL instead of an affiliate ID.
2) Dynamic Digital Watermarking: There have been many papers
[6] which describe methods by which artists can encode certain
data into an image such that it cannot be removed without
distorting the original image to a high degree. Thus if there
is ever any question as to whether an image belongs to the
artist, the watermark can be read. If the artist were able
to encode a unique watermark inside every copy of the file
distributed, and maintained a database with information about
who received each copy, then when someone sells the image
illegally, the person who downloaded that specific copy can
be identified. Such a scheme could obviously be detected easily,
since a user who downloads the image twice and compares the
two files will find that they differ; however, this does not
mean the watermark could be easily removed.
There is another large problem which sets dynamic digital
watermarking apart from our specific example here. A digital
watermark is generally embedded in the entire image, not just
a certain small representative range of data as in the case
which we looked at. Also, because of the complexity of watermarking
systems and ASP's very limited bit manipulation capabilities,
the optimal suggestion for implementing such a system would
be to employ an ActiveX component to do the hard work.
3) Basic Image Manipulation: A few ActiveX components have
been released which allow image manipulation on the fly by
making a few calls to these components. However, there is
now the potential to do basic operations without any controls.
In fact, you are limited only by how much bit manipulation
you can do in ASP!
4) Encoding registration information into executables: This
could also be a useful anti-piracy feature. If the name and
email address of a registered user are "engraved" into every
registered copy distributed, this would seriously deter piracy
[7]. However, note that if you can swap in the name and email
address in real time, then a cracker can just as easily use
a hex editor to change them. If you
want to implement a scheme like this, don't leave out the
crypto!
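Returning to the tracking idea in application 1, the bookkeeping could be prototyped along the following lines. This is a hedged sketch only: SQLite stands in for whatever database the site actually uses, the table name downloads and the function names are invented for the example, and the ten-character alphabet of digits and capital letters simply mirrors the random value described earlier.

```python
import secrets
import sqlite3
import string

ALPHABET = string.digits + string.ascii_uppercase  # 0-9 and A-Z


def new_download_code(length=10):
    """Generate a random code of digits and capital letters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def record_download(db, affiliate_id):
    """Issue a code for one download and remember who referred it."""
    code = new_download_code()
    db.execute(
        "INSERT INTO downloads (code, affiliate_id) VALUES (?, ?)",
        (code, affiliate_id),
    )
    return code


def affiliate_for(db, code):
    """Look up the affiliate behind a code reported at registration."""
    row = db.execute(
        "SELECT affiliate_id FROM downloads WHERE code = ?", (code,)
    ).fetchone()
    return row[0] if row else None


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE downloads (code TEXT PRIMARY KEY, affiliate_id INTEGER)")
code = record_download(db, 2485)
print(code, affiliate_for(db, code))
```

The code returned by record_download is what would be written into the ten byte file for that download, and affiliate_for is the lookup performed when the application later reports the code back.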
If you would like to contact me with comments, suggestions,
feedback, or for any other reason, you can reach me at adam@viratech.com.
References
[1] Database Insertion Source and Binary, Adam M Smith, January
2002, http://www.viratech.com/files/DBData.zip
[2] String Representation Source, Adam M Smith, January 2002,
http://www.viratech.com/files/StringLoader.zip
[3] Response Object, Microsoft Developer Network, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/iisref/html/psdk/asp/vbob5sj8.asp
[4] PKZIP Application Note, PKWARE Inc., Version 4.5,
http://www.pkware.com/support/appnote.html
[5] CRC32 ASP Source, December 2001,
http://www.viratech.com/files/CRC.zip
[6] Information Hiding - A Survey, Fabien A. P. Petitcolas
and Ross J. Anderson and Markus G. Kuhn, Proceedings of the
IEEE, 87(7):1062-1078, July 1999, http://citeseer.nj.nec.com/petitcolas99information.html
[7] Defending Shareware Against Cracks, Adam M Smith, April
2000, http://www.viratech.com/sharenc.htm
This article Copyright © 2002 Sense of Security Incorporated
All Rights Reserved.