SyntaxHighlighter

Wednesday, May 26, 2010

Another late night learning more about F#

Just thought I'd share a few minor things I'm working on in my spare time. The first little project is a visualization project that I'm doing for my wife.  She's collecting some information from a large group of schools in the district and compiling it all in an excel file. So I thought I'd write some F# to scrape the data from the excel file and perform some analysis on it and then visualize it in Silverlight.
The silverlight part hasn't started quite yet, but I should get to that tonight or tomorrow. Below is a screen shot of the excel document in the early stages of compiling the data manually.














The data is basically about competed teams at a state competition over the past 5 years broken out
by category. My job is to pull all this information and display by teams size (which is the red, green, and yellow on the right hand side of the document). Below is some of the F# I've come up with so far.

#r @"C:\Windows\assembly\GAC\Microsoft.Office.Interop.Excel\12.0.0.0__71e9bce111e9429c\Microsoft.Office.Interop.Excel.dll"
#r @"C:\Windows\assembly\GAC\Office\12.0.0.0__71e9bce111e9429c\Office.dll"

open Microsoft.Office.Interop.Excel
open System
open System.Runtime.InteropServices

// Create new Excel.Application
let app = new ApplicationClass(Visible = true)
//Create the workbook path
let workbookPath = @"C:\Research.xlsx"
// Open the workbook from the workbook path
let workbook = app.Workbooks.Open(workbookPath)
// Get the worksheets collection
let sheets = workbook.Worksheets
// Grab the worksheet we need to pull data from
let worksheet = (sheets.[box 1] :?> _Worksheet)
let TeamSizeByCategory (category: int) (startColumn: string) (endColumn: string) =     // Pull the data from each row and yield it into a list of obj[Dance,Military,Hip Hop,Kick,Pom,Show]
    // I-N (2006) P-U (2007) W-AB (2008) AD-AI (2009) AK-AP (2010)
    let data = [ for i=5 to 80 do
                    let row = (worksheet.Range(String.Format("{0}{1}",startColumn,i) , String.Format("  {0}{1}",endColumn,i)).Value2 :?> obj[,])
                    yield [row.GetValue(1,1)
                           row.GetValue(1,2)
                           row.GetValue(1,3)
                           row.GetValue(1,4)
                           row.GetValue(1,5)
                           row.GetValue(1,6)]]
    let CategoryFolks =
        data
        |> List.filter( fun i -> i.[category] <> null)
    let CategoryFolksSmall =
         CategoryFolks
         |> List.filter( fun item -> (item.[category] :?> double) <= 15.0)
    let CategoryFolksMedium =
        CategoryFolks
        |> List.filter( fun item -> (item.[category] :?> double) >= 16.0)
        |> List.filter( fun item -> (item.[category] :?> double) <= 29.0)
    let CategoryFolksLarge =
        CategoryFolks
        |> List.filter( fun item -> (item.[category] :?> double) >= 30.0)

seq {
         if CategoryFolks.Length > 0 && CategoryFolksSmall.Length > 0 then
          yield "Small", Math.Abs(float CategoryFolksSmall.Length / float CategoryFolks.Length) * 100.0
         if CategoryFolks.Length > 0 && CategoryFolksMedium.Length > 0 then
          yield "Medium", Math.Abs(float CategoryFolksMedium.Length / float CategoryFolks.Length) * 100.0
         if CategoryFolks.Length > 0 && CategoryFolksLarge.Length > 0  then
          yield "Large", Math.Abs(float CategoryFolksLarge.Length / float CategoryFolks.Length) * 100.0
       }


//Simple type to encapsulate a single year of data for all categories
type AnnualStatistic(year,data) =
    member x.Year = year
    member x.Data = data


//I really want the ability to persist the data from the excel file into a Db40 object database, which is something I've blogged about before.
let dbpath = "C:\Users\cfrederick\Documents\Visual Studio 2010\Projects\ResearchVisualization\db"
let persist (year: string) startColumn endColumn =
    let data = [ for i=5 to 80 do
                    let row = (worksheet.Range(String.Format("{0}{1}",startColumn,i) , String.Format("  {0}{1}",endColumn,i)).Value2 :?> obj[,])
                    yield [row.GetValue(1,1)
                           row.GetValue(1,2)
                           row.GetValue(1,3)
                           row.GetValue(1,4)
                           row.GetValue(1,5)
                           row.GetValue(1,6)]]

    /// Db40 Implementation to persists historical data from excel
    //the use keyword is synonymous to using in c#
    use container = Db4oFactory.OpenFile(Path.Combine(dbpath, "Researchdb.yap"));
    container.Store(new AnnualStatistic(year,data))
    container.Close
   
//Finally a helper function to retrieve a given years worth of data for all categories from the Db4o database.
let AnnualStatistic year =
    use container = Db4oFactory.OpenFile(Path.Combine(dbpath, "Researchdb.yap"));
    let results =
        seq {
                let r = container.Cast().Select(fun i -> i)
                for i in r do
                    yield i
            }
        |> Seq.filter(fun item -> item.Year = year)
        |> Seq.head
    results



The more I use F# the more I love it. :-)
-Develop with Passion

No comments:

Post a Comment