Algorithms

12/26/2007 @ 01:15:27

I will be using this semester's CompSci workshops as a way to teach an Algorithms course and its applications to Quant Finance. I want to hit the ground running with our first workshop, so I am posting my first lecture (and the intro lecture) slides online...I know! I didn't procrastinate!? Surprising, yes. please send me feedback. I need to know what you like/dont like before I start teaching so that I can pack in as much quality as is possible. Enjoy!

Intro Lecture
Lecture 1

Grades...are not important?

12/14/2007 @ 06:53:44

So, it's 6:42am and I've been up all night stud...who am I kidding? I've been installing Office 2007 and reading funny comics -- shit. Whatever, some things in life are more important than grades. I often do random things such as start a new programming language when I have finals. Why? It's really the only "free" time I have. All of my time during the school year (since Freshman year) has been tied up with clubs/jobs/work/schoolwork; finals are the only free time I actually have -- finals, and breaks. I always see these these moments as opportune for my delving into the arcane wealth of knowledge that the internet has.

Closing notes on this semester from a computer-science perspective: next semester will really rock your socks as I believe we have covered the foundations for what you need to program some nifty things. I hope to go through some basic C#/C++, Advanced Perl, Advanced VBA, and cover MATLAB this semester. I STRONGLY suggest you to work on your programming over break. As I just mentioned, you won't find too much free time during college if you make the most of it...Carpe Diem guys. Here are some comics to keep you up-to-speed. Lastly, play video games (Gears of War, Bioshock, The Orange Box, Call of Duty 4 are some of my current hotshots). Regardless of what adults may say, these really create a sense of problem-solving and teamwork (don't laugh!) aspects that are valuable with quant stuff. So serious. Ok, now comics! (NONE ARE MINE, all from random, funny sites I scour) -->

Comic 1
Comic 2
Comic 3
Comic 4
Comic 5
Comic 6
Comic 7
Comic 8
Comic 9
Comic 10
Comic 11
Comic 12

Thoughts on Programming - by Varun

12/02/2007 @ 01:28:55

Haven't posted in a long time...Found this comic quite entertaining.
Punishment So, I want to discuss some time-saving tricks with VBA-Excel. Mainly:

  • When to use VBA
  • Data From Workbook
  • Arrays
  • Hashes
  • Error Handling
  • File Input/Output (I/O)
  • Etiquette

When to use VBA

So many times over I have thought, "I want to use VBA for this thing!" I'll explicitly say, that is the exact opposite of what you should be thinking. Try as hard as you can to not use VBA with your Excel sheets. Why? While VBA is very clever, many people do not trust other's code and will Disable VBA on the opening of a workbook and will attempt to check it to ensure the safety of their computer. Especially in business, most individuals do not know coding and it's always easier to trace things just on spreadsheets (for bankers) than to delve into code. KISKIS - Keep it Simple, Keep it Safe. Now, if you had to resort to VBA, below are some tips that will make life easier for you and for other's who read your code.

Getting Data From Workbook

This section is pretty much "borrowed" from Dermot Balson's (my favorite VBA guru) website.
Resizing
You will always have a bunch of stuff on your spreadsheet, and invariably will eventually have to write some code to deal with that data (Example: you have two columns of numbers and you want to find a weighted sum of the first column using the second column as the weights). Sure, you can do that example by hand, but let's assume you have to use that elsewhere and compute it often, so you have to use some code. What is the best way to deal with that data? What first comes to mind is something like:
Public Function WeightAverage(rng1 As Range, rng2 As Range)
	Dim i!, wgtAvg#
	rng2Sum = Application.WorksheetFunction.Sum(rng2)
	For i = 1 to rng1.Count
		wgtAvg = rng1.Cells(i,1).Value * rng2.Cells(i,1).Value / rng2Sum _
			+ wgtAvg
	Next i
	WeightAverage = wgtAvg
End Function
But let's say you had 65536 numbers in each column, do you really want to hit the worksheet 2*65536 times just to calculate the weighted average? This ties back to my first point about what you really need VBA for; however, for the sake of this example we should find a solution to hitting the worksheet less than that many times. For this, we have the Resize element:
Public Function WeightAverage(rng1 As Range, rng2 As Range)
	Dim i!, wgtAvg#, r1r!, r1c!, r2r!, r2c!, V1 As Variant, V2 As Variant
	r1r = rng1.Rows.Count: r1c = rng1.Columns.Count
	r2r = rng2.Rows.Count: r2c = rng2.Columns.Count
	V1 = rng1.Resize(r1r, r1c)
	V2 = rng2.Resize(r2r, r2c)
	rng2Sum = Application.WorksheetFunction.Sum(rng2)
	For i = 1 To rng1.Count
		wgtAvg = V1(i, 1) * V2(i, 1) / rng2Sum + wgtAvg
	Next i
	WeightAverage = wgtAvg
End Function
You can actually use the Resizing tool to read AND write from the worksheets. If you have an array of a certain size and you know it should go into a certain sized named range:
Range("NamedRange").Resize(100,20) = Array
which is SO much faster than one's first instinct:
For i = 1 to Array.Rows.Count
	For j = 1 to Array.Columns.Count
		Range("NamedRange").Cells(i,j).Value = Array(i,j)
	Next j
Next i
Don't Mix Excel & VBA
What does that mean? If you are working with data and you need to perform a function, like summing a range, do NOT use Application.WorksheetFunction. It is very slow compared to an efficient algo you can write using VBA. Rather than using
Application.WorksheetFunction.Min(a,b)
to find the minimum of two numbers, write your own:
Function Min(a as Long, b As Long) As Long
	If a<b Then Min = a Else Min = b
End Function
Memory Management
You have a limited amount of RAM and that memory will be used for allocating variables into the memory blocks when you run VBA. There are a two major things you have to keep in mind when you deal with memory management: scope and type. Let me explain, scope is how long an initiated variable lasts. What does this mean? Since VBA uses, to some extent an object-oriented model (it's not truly object-oriented because VBA doesn't allow polymorphism [inheritence of upper-level properties] with classes...), we have objects in VBA. These include all the things you use in VBA: Range, Application, Worksheet...even Integer, Double, Single. So, when you create an instance of a variable, Dim i As Integer, you have created a copy of an Integer and called it i. Now this variable has a scope (lifetime). When the top-level routine ends, the variable (and thus, the memory block to which it is allocated) is freed up (in C++ you are used to malloc, free, realloc, calloc, mallopt, mallinfo, alloca, or valloc) automatically. Enough of the techno-babble; what does this mean for us?

If you have to create a subroutine and inside you need to pull data from a worksheet that is ONLY used in that subroutine, Dim(ension) your variable inside that subroutine so that when it ends, the block will be freed:
Sub INeedAnArray
	Dim i%, A%()
	For i = 1 to 1000
		ReDim Preserve A(1 to i)
		A(i) = i
	Next i
End Sub
When the Sub ends, both i and A will be freed up. Below is a list of variable types in VBA and how much space they take up in memory:
Bit            - 1 bit (0 or 1)
Byte           - 1 byte (0 to 255) = 8 bits
Boolean        - 2 bytes ("True" or "False")
Integer        - 2 bytes (-32768 to 32767)
Long (int)     - 4 bytes (-2,147,483,648 to 2,147,483,648)
Single (fp)    - 4 bytes (-3.402823e38 to -1.401298e-45 and 1.401298e-45 to 3.402823e38)
Double         - 8 bytes (e308 to 3-324)
Currency       - 8 bytes (-922 Trillion to 922 Trillion)
Decimal        - 14 bytes
Date           - 8 bytes
Object         - 4 bytes
String (var)   - 10 bytes + length of string (can be from 0 to ~2 billion)
String (fix)   - length of string (1 to ~ 65,400)
Variant (num)  - 16 bytes (any numeric up to double)
Variant (chr)  - 22 bytes + length of string (can be from 0 to ~2 billion)
User-Defined   - Number required by elements
Arrays         - 20 bytes + 4 bytes for each dimension
Dermot Balson also did a study to find out which variable types were fastest with calculations in VBA. While one would expect the smaller variable types (Integer and Byte) to be fastest, it is actually the native processor-type variable (example: For most computers, 32-bit processor = 4 bytes = Long, Newer computers, 64-bit processor = 8 bytes = Double) that is the fastest because the computer can allocate the variables into the memory blocks without any down/up conversion into different non-native types:

The last thing I want to say is regarding the shortcuts you can use when defining variables. You will see that I do it often because I'm a prick...Actualy, I'm just lazy and it makes my VBA code look 10x cooler:
Dim i As Integer   - Dim i%
Dim i As Double    - Dim i#
Dim i As Single    - Dim i!
Dim i As Long      - Dim i&
Dim i As String    - Dim i$
Dim i As Currency  - Dim i@
Dim i() As Integer - Dim i%()
...

Arrays

Size
Let's get the basics out of the way. Imagine that an array is like a matrix, it has rows and columns and to reference a particular entry, you have to reference the position the entry is in the matrix (i.e. for the 2nd row and 3rd column's entry, you say Array(2,3)). The Array can start at any numbering system, but it has to be indexed by whole numbers (i.e. you can say, Dim A(-1 to 5) As Integer will give you 7 entries that must be integers and are indexed from -1, 0, ..., 5). The Array itself needs 20 bytes + 4 bytes for each dimension added + the number of entries and their corresponding memory requirements (i.e. Dim A(1 to 6, -2 to 4) As Double will be 20 for the array + 4 * 2 for 2 dimensions + [ (6-1+1) + (4--2+1) ] * 8 for all the entries we're storing as Doubles = 132 bytes). As you can see, the size of an array can grow fast, but no need to worry because if you use the variable types that you need, you should be fine.
Nesting You can store anything inside an array, even another array:
Sub ShowTwoArrays()
	Dim A(1 to 1) As Variant
	Dim B(1 to 2) As Integer
	B(1) = 5: B(2) = 6
	A(1) = B
	Debug.Print A(1)(1) & vbNewLine & A(1)(2)
End Sub
Storing
As In the previous section, I had some code that put numbers 1 to 1000 into an array (when I was talking about scope). For and While loops are particularly useful for going through lists (arrays) and doing something with each entry (like summing, or moving averages). You can also loop through arrays when you need to pull data out by using the LBound() and UBound() functions, which give you the lower and upper index numbers, respectively:
Sub StorePrintArray()
	Dim i%, c%, A%()
	c = 500
	ReDim A(1 to c)
	For i = 1 to c
		A(i) = i
	Next i
	For i = LBound(A) to UBound(A)
		Debug.Print A(i) & ", "
	Next i
End Sub
Dim, ReDim, ReDim Preserve
Dim(ension) initiates an instance of a variable type. If none is indicated, Variant (20+ bytes) is used -- not very efficient. You can ReDim(ension) an array to a different size, but that will ERASE THE DATA inside the array unless you use ReDim(ension) Preserve (data). Confused? Look:
Sub GinAndTonic()
	Dim A%()
	ReDim A(1 to 100)
	For i = 1 to 100
		A(i) = i
	Next i
	''--> Don't know if I've mentioned it yet, but ' starts a comment
	''--> Now we have an array with 100 elements
	''--> Let's say I wanted to add a 101st element...
	ReDim A(1 to 101)
	A(101) = 101
	''--> Guess what? You lost entries 1 to 100
That line that said "ReDim A(1 to 101)" should've read "ReDim Preserve A(1 to 101)". That saves everything in A and adds on a 101st entry.
Option Base
You can put Option Base 1 at the top of your module to start your arrays' indices at 1 instead of the standard 0. This also affects the arrays when you use the .Resize method.

Hashes

Ahh, my favorite trick. Ok, let's look at the following example which presents a bunch of common problems. Let's say you have an array of 10,000 names (it's a string array, A$(1 to 10000)). Here's the crux: You want to find out whether a name exists in the array before adding it to the end. What choice do you have but:
'' Assume that we have an array, A defined at a scope higher than the Sub
Sub AddNewName(name$) As Boolean
	Dim i!, exists As Boolean
	exists = False
	For i = LBound(A) to UBound(A)
		If (A(i) = name) Then exists = True
	Next i
	If Not (exists) Then
		ReDim Preserve A(LBound(A) to UBound(A) + 1)
		A(UBound(A)) = name
	End If
End Sub
So what's wrong with this? Well for one, it's highly inefficient. You have to loop through the entire array to find a name each time. Well maybe this isn't so bad...We're just adding one name, right? *wrong* What if you have a list of names in a separate array, 100 names, that you want to check and add? Ahh...Now you must be thinking, "shit...". Well let's take a step back. What if you could index an array by strings?. Then all you would have to do is check if the index exists, and if not, add a new index? That almost "reverses" the role of what we used the array for before --> Instead of storing names, we store indices and instead of checking through all the entries, we just check for a single index! This reduces our run-time from O(n) or worse to a constant, 1 (for our purposes at least). So how do we do this?
Dim d As Dictionary
Set d = New Dictionary
What? you don't see this as a variable type when you type? Ok, into your VBA pane and go to Tools --> References, and enable Microsoft Scripting Runtime. It uses the scrrun.dll library, which has the Dictionary element. This is basically the hash we are talking about. It is comparable to a hash element in Perl (my %h).
'' Again, assume we have a higher-level dictionary, d, defined already
Sub AddName(name$)
	If Not (d.Exists(name)) Then d.Add Key:=name, Item:=1
End Sub

Error Handling

Why?
When you run into errors, you can either see this retarded screen (which is highly unprofessional, semi-useful though it may be):

or you can see this nice MsgBox:

Not only for professionalism, but it's a cleaner error message and if you choose to press "End" on the default one, your code will stop at the breakpoint, even if you had some environment variables changed (I will talk more about that in Etiquette & Efficiency).
How?
Now the ideal error handling code (almost ideal, Dermot would add in an environment state for CalculationMethod as well...):
Sub WhateverSubYouHaveHere()
	'------------------------------------------------------------------
	Dim EE As Boolean, SU As Boolean, DA As Boolean
	EE = Application.EnableEvents: Application.EnableEvents = False
	SU = Application.ScreenUpdating: Application.ScreenUpdating = False
	DA = Application.DisplayAlerts: Application.DisplayAlerts = False
	AC = Application.Calculation: Application.Calculation = xlCalculationManual 
	On Error GoTo err_hdl
	'------------------------------------------------------------------
	...
	[Rest of the code in your sub goes here]
	...
	'------------------------------------------------------------------
	Application.EnableEvents = EE
	Application.ScreenUpdating = SU
	Application.DisplayAlerts = DA
	Application.Calculation = CA
	Application.StatusBar = "Ready"
	Exit Sub
err_hdl:
	Application.EnableEvents = True
	Application.ScreenUpdating = True
	Application.DisplayAlerts = True
	Application.Calculation = xlCalculationAutomatic
	Application.StatusBar = "Ready"
	Call ErrMsg
End Sub

'' When an error occurs, Excel sets Err.Number  0, and this will trigger a message
Sub ErrMsg()
	If Err.Number  0 Then MsgBox "Error " & Err.Number & vbNewLine _ 
		& Err.Description & vbNewLine & Err.Source
End Sub

File Input/Output (I/O)

This is for reading a text file and/or writing to a text file. This is very useful with XML text files, or if you want to download some data and open it / parse it with Excel (although I suggest Perl for it), or plethora of other reasons. How?
Best Way

Etiquette & Efficiency

Why?
Syntax

Introduction

03/10/2007 @ 08:19:58

Wondering what language to learn? If you have some programming experience and are comfortable with the basic flow structure of programming, then try learning C++ or Java. You can pick up a book from B&N or simply find a good, free tutorial online. If you have no experience whatsoever, you can still do the above, or like me just start picking up whatever you can get your hands on easily.

I didn't really know any programming when I started in 7th grade. I went to some dorky computer camp and learned Quick Basic. I was actually quite good at it and got a trophy...the bane of my career...Anyway, I started HTML hacking as everyone did when the MySpace craze hit during high school and I believe that pushed me into mastering web languages (PHP, HTML, DHTML, JS, CSS, XML, RSS). Unfortunately, because of my newfound liking for seeing results quickly with web languages, I never really had the desire to pick up C++ or Java during high school. This is my only regret of learning web languages first. I started on VBA during my junior year for a job, and it was a great idea to learn VBA for Excel and Access (you will get a lot of props from companies for knowing it).

I'm really just babbling here, because I want to see if WP parses my LaTeX correctly... \int_{a}^{b}\frac{1}{2}x^2dx=\left(\frac{1}{6}x^3\right)_a^b=\frac{b^3-a^3}{6}

-Varun

Quick-Quotes

IndexCurrentChgChg(%)
Dow Jones9015.10+62.21 +0.69%
Nasdaq1652.380.00- 0.00%
S&P 500934.700.00- 0.00%
10-Year T-Bill2.5050.00- 0.00%