Archive for the ‘Technologies’ Category

Backup Issue

April 28th, 2010 Frank No comments

I needed a way to back up my web and database servers. While there are plenty of great options out there, such as rsync and Unison, I found it difficult and time-consuming to get the Linux server to sync to Windows*. My main problem was actually more with SSH than anything else: either I don’t know enough about it, or the SSH software available on Windows isn’t as well developed. (I’d love others’ perspectives on this…)

Anyway, for the time being I decided that the easiest thing to do was to write my own little Python program to handle the backup and the automatic retrieval of the resulting archives. I don’t push or pull a lot of data between the computers, so pulling full archives through the network (as opposed to only the changed bits, as rsync or Unison would) doesn’t matter much to me. The most important part is that the backup is current.

The programs are by no means complicated or unique; in fact, they might be the simplest backup programs ever. I’ve posted the source code on BitBucket, where it can be viewed at: http://bitbucket.org/frankv01/vpsbackup

If you either end up using it or deriving a new backup solution from it, please drop me a line. I’d love to know that.

* You might wonder why someone who writes about (lives and breathes) open source would use Windows… The fact of the matter is that my professional work (as of this writing, at least) is done on Windows, with SQL Server and .NET to be specific. I’ve never said anything bad about Microsoft, nor will I. I believe that open source and traditional closed source each have their place, and software developers can run their code bases as they see fit. I, for one, prefer open source to closed source, paid or otherwise.

Move a MySQL database – How To

April 13th, 2010 Frank No comments

I had to move one of my databases from the main OS drive to a different partition. Since I had not administered my own MySQL server before (nothing beyond SQL-level administration), I needed to find some sort of guide. I found the following and was quite pleased; I thought that others might be able to use it too.

How to move MySql database to another drive or partition.

While this post is not strictly about the internal workings of an open source software package, it does have to do with the administration of a notable piece of open source software and hence is worth posting on this blog.

The one thing to note is that the instructions seem out of date. They were written using Ubuntu Server 7.10 (Gutsy Gibbon) as a basis. They worked for me, except for the AppArmor note below. I’m currently using Ubuntu Server 8.04 LTS.

You may need to modify AppArmor’s configuration file and then restart the service. You can see the conversation about it in the comments on the article, but the brief version is:

  1. Complete the main instructions first. (Understand the entire article before starting.)
  2. Open the AppArmor profile for MySQL in an editor: nano /etc/apparmor.d/usr.sbin.mysqld
  3. Modify the entries pointing to the old location so that they point to the new location. This will look like: /var/www/mysql_datadir/ r, /var/www/mysql_datadir/** rwk,
  4. Restart AppArmor, which can be done via: /etc/init.d/apparmor restart

Again, the full article is: How to move MySql database to another drive or partition.

http://www.ubuntu-howto.info/howto/how-to-move-mysql-databases-to-another-location-partition-or-hard-drive

Categories: Article, MySQL

Processes and Threads

April 1st, 2010 Frank No comments

I was having a conversation with a friend of mine; we were discussing how computers organize processes and threads. Essentially, how the processors (cores, actually) handle the various tasks required of them.

This led me to my usage of Threads in the tOSU-WebServer and made me feel that some might benefit from a practical, applied overview of the topic.

What are Threads

A computer consists of many components, and they vary widely from computer to computer. One component that is key to every computer is the central processing unit (CPU). A while ago, when I started programming, it was almost necessary to understand how a CPU works. These days, this isn’t so much the case.

The tOSU-WebServer utilizes threads; I suspect that most web servers do, though I don’t believe this is a strict requirement.

In order to understand tOSU, however, basic knowledge of threads will certainly be useful. To start, I think it is helpful to understand that computers can only accomplish so much at a given time. At one point, computers could only accomplish one thing at a time; this is quickly fading away, as multicore processors are quite common these days.

But what does multicore actually mean? A core, in a general sense, is the component that does the actual calculations and logic. When you write: if( j > 0 ) i = j + 2; a core handles both the logic (the comparison) and the mathematics (the addition). This means that a dual-core machine can do two of these concurrently, whereas a quad-core machine can do four at any given moment. More cores help, but only up to a point: a step that depends on the result of a previous step still has to wait for it.

A simple example

In order to help initial understanding of a program with multiple threads, I’ve put together a small Java “Hello,World” like program that utilizes multiple threads. This code is basic but should introduce the topics nicely. If you are new to threads, analyzing and running this program on your local computer will be greatly helpful.

The source code can be viewed directly at: MultiThreadedHelloWorld.java

The output for the program ends up as:

Hello World!
-> Each Thread started. Waiting for all threads to complete
Hello, Becky
Hello, Frank
Hello, Heather
Hello, Jim
Hello, Evan
Hello, Charlie
Hello, Alex
Hello, Greg
Hello, Irene
Hello, Doug
-> Done, exiting

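Note that the names were loaded into the ArrayList in alphabetical order, yet the greetings are printed out of order. This is one of the defining traits of threads: once started, they run independently of each other, and the order in which they do their work is not guaranteed.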
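For readers who would rather see the shape of such a program without opening the link, here is a minimal sketch of the same idea. It is not the linked MultiThreadedHelloWorld.java: the class name HelloThreadsSketch and the greetAll() helper are placeholders of my own, and the real file may well be organized differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not from the linked source.
class HelloThreadsSketch {

    // Starts one thread per name, waits for all of them to finish, and
    // returns the greetings in the order the threads recorded them.
    static List<String> greetAll(List<String> names) {
        final List<String> greetings =
                Collections.synchronizedList(new ArrayList<String>());
        List<Thread> threads = new ArrayList<Thread>();

        for (final String name : names) {
            Thread t = new Thread() {
                @Override
                public void run() {
                    String line = "Hello, " + name;
                    greetings.add(line);
                    System.out.println(line);
                }
            };
            threads.add(t);
            t.start(); // returns immediately; run() executes on the new thread
        }

        System.out.println("-> Each Thread started. Waiting for all threads to complete");
        for (Thread t : threads) {
            try {
                t.join(); // block until this thread's run() has returned
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and stop waiting
                break;
            }
        }
        return greetings;
    }

    public static void main(String[] args) {
        System.out.println("Hello World!");
        greetAll(Arrays.asList("Alex", "Becky", "Charlie", "Doug", "Evan",
                               "Frank", "Greg", "Heather", "Irene", "Jim"));
        System.out.println("-> Done, exiting");
    }
}
```

Run it a few times: the greetings will often come out in a different order from one run to the next, because start() only hands the thread to the scheduler and join() only waits for it to finish.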

tOSU-WebServer

Anyway, using that as a basis, we can move on to the actual tOSU-WebServer. The server uses different threads for different tasks. Aside from the processing benefits, this separates the code that responds to the client from the code that handles incoming connections; the two are different tasks. The code that waits for incoming connections will spawn (create) a thread to handle each incoming connection.

So, what handles what? The class “MyWebServer” (in the ui package) handles the incoming connections. The “HttpWorker” class (in the http package) handles the client request. Let’s take a brief look at the code that actually starts the threads.

77  while( true ) {
78      Socket sock = servsock.accept();
79      if( _mode == ServerMode.WebServer )
80          WorkerFactory.newServerWorker(sock, _pathToServeFrom, _dPrinter).start();
81      else
82          WorkerFactory.newListener(sock, _dPrinter).start();
83  }

This snippet starts at line 77 of MyWebServer.java, tagged as v0.5.2. There is actually more here than we need, but using actual, running code should make the example easier to follow. Both methods, newServerWorker() and newListener(), return an object that is an instance of Thread (see Thread in the Java API). Once we call the start() method, the main thread loops back to the servsock.accept() call on line 78 to wait for a new incoming connection, and concurrently the new thread starts executing. If you recall, it executes the contents of the instance’s run() method, and this continues regardless of whether new connections are received and additional start() methods are called. Each connection is handled independently of the others, even if all the responses go back to the same client.

That is pretty much all there is to it. At the start it might have sounded complex, but in reality it is simple. Coordinating large, multi-threaded programs, however, can become complex, particularly with regard to passing data between threads and coordinating how it is handled. That topic is an entirely different blog entry. Thank you for reading. Please leave comments below.

Presenting tOSU Web Server – An open source web server

March 20th, 2010 Frank No comments

I’ve just finished the Winter 2010 term of my graduate degree. I took two classes this term, SE-450 and CSC-435. Both classes were great, but taking them concurrently was not a great idea. Nevertheless, I have something to share which is ultimately a derivative of the two classes. One of the project assignments in CSC-435 (Distributed Systems I) was to create a web server. We were given the basics of how a web server and client work, but were then left to our own devices to gather the HTTP response codes and other such information. My intention is to share a good portion of the basics here, along with my web server (slightly modified for this site).

Background

Web servers (and browsers) work on top of basic sockets. While this entry isn’t going to be a comprehensive introduction to the networking technologies involved, one area that is key to the web server is the idea of sockets. A socket is a communication channel through which two programs can communicate. The communication takes place over ports.

Now, we need to understand how to use a socket in Java, which is the implementation language for the tOSU Web Server. Java (SE 6) provides two socket classes: ServerSocket and Socket. The former waits for incoming connections; essentially, it becomes the server. A Socket, on the other hand, represents a single connection: the client creates one to connect, and the server receives one for each connection it accepts. It becomes the communication channel between the client and the server.

The high-level idea of creating a web server is to create a ServerSocket instance and wait for incoming connections. Assuming it receives a properly formatted HTTP request, we handle the request and return data (web pages) to the client. The question is, what constitutes a valid HTTP request?
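To make the two classes concrete, here is a small sketch in which a ServerSocket and a client Socket talk to each other inside one program. It is not taken from tOSU-WebServer; the class name SocketPairSketch and the roundTrip() helper are made up for illustration, and I am assuming Java 7 or later (for try-with-resources and getLoopbackAddress()).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Illustrative sketch only; not part of the tOSU-WebServer source.
class SocketPairSketch {

    // Sends one line from a client Socket to a ServerSocket on the loopback
    // interface and returns what the server side read.
    static String roundTrip(final String message) {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            final int port = server.getLocalPort();

            // The client runs on its own thread so that accept() below has
            // something to wait for.
            Thread client = new Thread() {
                @Override
                public void run() {
                    try (Socket sock = new Socket(InetAddress.getLoopbackAddress(), port);
                         PrintWriter out = new PrintWriter(sock.getOutputStream(), true)) {
                        out.println(message);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            };
            client.start();

            // accept() blocks until the client connects, then hands back a
            // Socket that represents this one connection.
            try (Socket conn = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(conn.getInputStream()))) {
                return in.readLine();
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Server read: " + roundTrip("hello from the client"));
    }
}
```

The point to notice is that the server never reads from the ServerSocket itself; it only reads from the Socket that accept() returns.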

The HTTP Request

HTTP is nothing more than a protocol. We’ve all heard this, but I’m not sure we all know what it means. A protocol is merely a set of rules; I don’t believe that a protocol is anything more or anything less. The HTTP protocol is actually quite comprehensive, but creating the tOSU-WebServer has taught me that we do not need to implement the entire protocol for a pedagogical web server application. We simply need to provide the basics, perhaps a little more, and it’ll work. This is what my web server represents.

HttpFox results for Cat.html

I learned about the protocol in two ways, neither of which had to do with reading the actual published documentation. First, I used a Firefox plug-in called HttpFox to review the communication between a client and an existing web server (the Apache server for this site) serving a simple HTML page. Second, I created (as an assignment) a “Listener” / echo program. The listener program is built in to the tOSU-WebServer (as a switch); I’ll cover using it in the following sections. The screen capture on the right shows my results for retrieving an HTML file called “cat.html”.

The top row is a single request; if you enable HttpFox for a request on this site, you’ll notice that several requests are made. Each resource (HTML, CSS, images) becomes a request. The left side is the request header (for the selected request), or what Firefox sent to the web server. The right side is what the web server sent back to Firefox. Our web server implementation must accept and read in the request header, process it, and return the response header to the client along with the HTML page / data.

As I write that, it sounds like a lot, but it really isn’t hard to do. There are only a few required items on each side. The important line in the request header (from the client) is the “(Request-Line)”, and in the response header (from the server) it is the “(Status-Line)”. The request line states what the browser is requesting: the file. The status line is the server’s response. You can view a list of common status codes on Wikipedia but, again, only a small subset is pertinent to the implementation of a simple web server.

Headers as Implemented

The headers that tOSU-WebServer must read and generate are quite straightforward.

The line we must process from the client browser is the request line, which looks like GET /overview-summary.html HTTP/1.1. The GET indicates that the browser wants to get a file, /overview-summary.html is the file it wants, and HTTP/1.1 is the protocol the client is using (that is, the format of the request). This single line is the only line we are interested in. The client sends more, but tOSU-WebServer ignores the remaining items.
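Splitting a request line into its three parts takes only a few lines of Java. The sketch below is my own illustration, not the parsing code from tOSU-WebServer; the class name RequestLineSketch and its fields are hypothetical.

```java
// Illustrative sketch only; not the parser used by tOSU-WebServer.
class RequestLineSketch {
    final String method;   // e.g. GET
    final String path;     // e.g. /overview-summary.html
    final String protocol; // e.g. HTTP/1.1

    RequestLineSketch(String method, String path, String protocol) {
        this.method = method;
        this.path = path;
        this.protocol = protocol;
    }

    // Returns the parsed request line, or null when the line does not
    // consist of exactly three whitespace-separated parts.
    static RequestLineSketch parse(String line) {
        if (line == null) {
            return null;
        }
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            return null;
        }
        return new RequestLineSketch(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLineSketch r = parse("GET /overview-summary.html HTTP/1.1");
        System.out.println(r.method + " | " + r.path + " | " + r.protocol);
    }
}
```

A server built this way would typically answer a null result with an error status rather than trying to guess what the client meant.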

The web server must respond with a few more lines, but the response is still not extensive. The first line, as previously mentioned, is the status line. This is formatted as HTTP/1.1 200 OK. The 200 and OK can be other numbers and statuses, but the idea holds. The HTTP/1.1 is the response protocol.

The next two lines are Content-Length: 500 (where “500” is the size in bytes) and Content-Type: text/html (where text/html is the appropriate MIME type).

Each one of these lines must be terminated with a carriage return followed by a newline. In Java, this is written as \r\n. Finally, to indicate that the headers are complete, we send an empty line, so the header block ends with \r\n\r\n. The browser then expects the content.
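Put together, the three header lines and the terminating blank line can be assembled as in the sketch below. This is an illustration under my own naming (ResponseHeaderSketch, headers(), response()), not the HttpClientHeaders code from the repository, and it assumes the body is HTML encoded as UTF-8.

```java
import java.nio.charset.Charset;

// Illustrative sketch only; not the header code used by tOSU-WebServer.
class ResponseHeaderSketch {
    static final String CRLF = "\r\n";

    // Builds the status line, the two headers discussed above, and the
    // blank line that marks the end of the header block.
    static String headers(int statusCode, String reason,
                          String mimeType, int contentLength) {
        StringBuilder sb = new StringBuilder();
        sb.append("HTTP/1.1 ").append(statusCode).append(' ').append(reason).append(CRLF);
        sb.append("Content-Length: ").append(contentLength).append(CRLF);
        sb.append("Content-Type: ").append(mimeType).append(CRLF);
        sb.append(CRLF); // empty line: the headers are complete
        return sb.toString();
    }

    // Returns a complete 200 response for an HTML body. Content-Length is
    // counted in bytes, so the body is encoded before it is measured.
    static byte[] response(String html) {
        byte[] body = html.getBytes(Charset.forName("UTF-8"));
        byte[] head = headers(200, "OK", "text/html", body.length)
                .getBytes(Charset.forName("US-ASCII"));
        byte[] all = new byte[head.length + body.length];
        System.arraycopy(head, 0, all, 0, head.length);
        System.arraycopy(body, 0, all, head.length, body.length);
        return all;
    }

    public static void main(String[] args) {
        System.out.print(headers(200, "OK", "text/html", 500));
    }
}
```

Measuring the encoded bytes rather than the string length matters as soon as a page contains non-ASCII characters, since one character can occupy several bytes.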

tOSU WebServer

First, where is the code? I’ve placed it on BitBucket. The BitBucket project path is: http://bitbucket.org/frankv01/tosu-webserver/overview BitBucket provides a software project with various services, one of which is a Mercurial-based repository. The site also offers archived versions of the tip of the repository; at the moment this option is on the far right of the page above, labeled “get source”.

The command to clone the repository (full history) is:

hg clone http://bitbucket.org/frankv01/tosu-webserver/ tOSU-WebServer

This will give you a clone of the repository, in a directory called tOSU-WebServer, wherever you issued the command; this is essentially the project’s name (for lack of a better one). Note: This article is being written against the tag “v0.5.x”. Once you have cloned the repository, run “hg update v0.5.1”.

While I’d love to review everything, including the architecture, this inaugural post can only include so much information. I figure the first step is understanding the overall architecture well enough to look through the code. Then we’ll take a look at the specific code segments that process incoming requests.

Architecture

As I stated at the start of this post, this program was developed while I attended an object-oriented architecture course and a distributed computing course. This combination made the program take on an architecture that is likely more complex than it needed to be, but is strongly OO in nature. The design led to a large number of classes, each with a narrow task to accomplish. I believe that this will make it easier to understand… once you can follow the design. Please feel free to ask questions. I learn by teaching, and I can only improve articles like this by receiving questions.

Package Layout & Design

I’ve used packages to organize the program; understanding these should make it easier to find what you might be looking for.

  • com.theOpenSourceU.webserver.arguments : A package to handle command-line argument/flag processing and parsing.
  • com.theOpenSourceU.webserver.debugutil : A package to handle text based debug and error messages.
  • com.theOpenSourceU.webserver.http : The core of the program, this contains the code that ultimately is the web server.
  • com.theOpenSourceU.webserver.ui : Contains the main executing class; the program to launch and manage the various pieces of the web server.

What main does

Since the goal is to understand how the program works, let’s review what the program actually does. The file we are reviewing is MyWebServer.java (in the ui package), which contains a class called (surprise) MyWebServer.

What the program basically does is:

  1. Process any given arguments, setting class level fields.
  2. Get a new instance of ServerSocket. When we construct the new instance, we give it the port (_port) and the queue size. Both values will be covered later.
  3. Next, we call servsock.accept(), which is a blocking call; it will block the program until a connection is received.
  4. Once a connection is received (via the port), accept() returns a Socket instance for it, which we stash in sock.
  5. Depending on the server mode, either a new server worker or a new listener will be created and started. These are different modes and are set via the arguments. Note that each of them is derived from Thread, and hence we are starting a new thread upon calling start().
  6. Go back to 3 to wait for another connection.

This is the gist of the program’s flow. The details of handling the request live in the http package. We’ll review this package in the next section.

Implementation

The http package contains various classes, of which only a small subset is actually public. We’ll review a few classes in the next few paragraphs; however, the best way to review all of the classes is to generate the javadoc files and read those.

In the earlier section, we saw the class WorkerFactory. This is a class that generates appropriate instances of the two workers contained in the package. A worker is a class derived from Thread that performs some task, in our case handling HTTP requests. The two concrete classes that can be generated are HttpWorker and MyListener.
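The reason a factory lets most of the package stay non-public can be shown in a few lines. What follows is a simplified sketch of the pattern, not the real WorkerFactory: the actual methods take more arguments (a path and a debug printer), and the class names WorkerFactorySketch, HttpWorkerSketch and ListenerSketch are stand-ins of my own.

```java
import java.io.IOException;
import java.net.Socket;

// Illustrative sketch only; simplified from the pattern described above.
class WorkerFactorySketch {

    private WorkerFactorySketch() {
        // static factory methods only; no instances needed
    }

    // Callers see only Thread; the concrete worker classes stay hidden.
    static Thread newServerWorker(Socket sock) {
        return new HttpWorkerSketch(sock);
    }

    static Thread newListener(Socket sock) {
        return new ListenerSketch(sock);
    }

    // Shared base class: holds the connection and closes it when done.
    private abstract static class Worker extends Thread {
        protected final Socket sock;

        Worker(Socket sock, String name) {
            super(name);
            this.sock = sock;
        }

        protected void closeQuietly() {
            try {
                sock.close();
            } catch (IOException ignored) {
                // nothing useful to do if closing fails
            }
        }
    }

    private static final class HttpWorkerSketch extends Worker {
        HttpWorkerSketch(Socket sock) {
            super(sock, "http-worker");
        }

        @Override
        public void run() {
            // A real worker would read the request line and send a response here.
            closeQuietly();
        }
    }

    private static final class ListenerSketch extends Worker {
        ListenerSketch(Socket sock) {
            super(sock, "listener");
        }

        @Override
        public void run() {
            // A real listener would echo what the client sent here.
            closeQuietly();
        }
    }
}
```

Because the callers only ever hold a Thread, the worker classes can be changed or replaced without touching the code that accepts connections.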

The HttpWorker class is the class of interest here. This class becomes the worker thread that handles the request sent to the server. Another way to put this is that this is what the client browser is actually talking to, and not the MyWebServer instance. This is how the web server can handle several requests at once.

Since we are on it, why don’t we continue on from the HttpWorker class. The class extends Thread and implements the run() method. Let’s not go into detail, but this is the code that processes the request and ultimately provides the content that is sent back to the client browser. Inside this method, we reference another factory: HttpContentFactory. This factory can provide implementations of HttpContent for a variety of file types, including CSS, HTML and a made-up dynamic page. (Images weren’t working.)

The counterpart to HttpContent is the HttpClientHeaders instance. This represents what becomes the server response headers. This web server only supports a few status codes (recall, not all need to be supported). The class HttpClientHeadersImpl supports a 404 error (not found), a 500 error (internal server error) and a 200 success status. The implementation details are not relevant to this initial introduction, but it is important to know that the HttpWorker class can’t complete its job without an instance of this to report status (success/error) to the client.

From here, HttpWorker renders these two instances and sends the result back to the client that made the request.

More to Come…

In the details, the program does a lot more than what I’ve outlined in the last few sections. However, I suspect that wrapping your head around these first sections will make reviewing the source code less intimidating.

If you have any questions or feel that the article above can be improved, please let me know via the comments. I hope to post more educational articles on tOSU-WebServer and I would greatly appreciate direction. If you are interested in a particular section, please let me know (again, via the comments). Oh, and don’t forget to follow the project on BitBucket.

The Tools of Open Source

October 27th, 2009 Frank 1 comment

As some readers will know, I’ve been working to study the architecture of Firefox. One thing I’ve realized is that I’m not as familiar as I should be with some of the common open source tools, or what I consider to be the common open source tools.

Below is a list of tools that are worth learning (in my opinion). I’ve also included a short description of why it might be worth learning them. (The list is in no particular order)

If you have an interest in open source, it will not hurt to get a quick base understanding of each of these. I’m not saying you need to become an expert in each of these tools (nor am I trying to). However, having a basic understanding of their syntax and function should save time and headaches while trying to understand a project.

  1. Linux: The concepts in and around Linux are often used on other open source products. I think Open Source developers tend to stick with using open source software. So, there is a link there.
  2. Bash: The de facto standard shell for Linux (as far as I can tell). Knowing the basic usage of bash can save you time and confusion. Certain scripts can depend on features of your shell. Mac OS X ships with a version of bash, which is good to know…
  3. GCC: This is the GNU Compiler Collection and is often a requirement to build open source packages.
  4. Make: This is a separate GNU tool rather than a part of GCC, but the two are very often used together. I want to make special mention of it because knowing how to read makefiles and Make’s error messages can help diagnose a failed build.
  5. C / C++: Low Level libraries are often written in C or C++, even for an otherwise Java or Python based program.
  6. Python: Python is sometimes used in conjunction with Make to check for build dependencies, verify (build) requirements, or many other possible things.
  7. Perl: Often used like Python, from what I understand but I’ve yet to learn much about it.
  8. Subversion: This is the most common open source VCS software in use (based on my own observations)
  9. Mercurial: One of the two popular DVCS systems. I’ve noticed more and more open source projects switching to DVCSs, so a basic working knowledge of Mercurial and Git is helpful.
  10. git: The second of the two popular DVCS systems.

If you’d like to contribute to the list, please leave a comment below but please ensure you include a fair reason.

Using COALESCE to Build Comma-Delimited String directly in SQL

October 27th, 2009 Frank No comments

The following article covers a great should-know piece of information for any SQL Server developer. Though comma-delimited strings are not usually desirable inside a database, it can be nice to receive data this way for things like embedding into a query string or passing over a web service*.

http://www.sqlteam.com/article/using-coalesce-to-build-comma-delimited-string

*I don’t necessarily suggest the web service idea. I was just saying for illustrative purposes…

Categories: SQL Server, Tips

Python Memory Performance

October 8th, 2009 Frank No comments

I’ve been studying Python because it is extensively used in open source projects to script certain types of work or actions. I think Mozilla uses it to verify build time requirements, for example.

But Python itself is a full-fledged programming language, not merely a scripting language. In fact, Mercurial is written in Python, which at first amazed me.

So, anyway as part of my pursuit of studying Firefox, I thought I should (at least) learn the basics of Python so that I could read any relevant scripts. Python is quite different from other languages I’ve encountered.

One thing I started to notice about Python is that it seems to use more memory. I’m not sure of this, nor am I suggesting that it uses more than other languages. I just noticed that memory consumption seemed to grow rapidly when running a Python script.

The following article seems to explain why: instantiating a class is expensive. According to the article, a class is 336 bytes. Since, in general, a class will use multiple other classes, I can see why memory consumption was growing rapidly.

Ultimately, I don’t think it matters much; you just need to be aware that it happens. A modern computer will run the script without the effect ever being noticed, and you’d need to create a lot of objects to cause a problem. But I do believe it is worth acknowledging so that you can keep an eye on memory usage. It’s mostly noticeable on my company-provided work computer, which only has 1 GB of RAM (it runs Windows XP).

http://www.valuedlessons.com/2008/10/blog-post.html

Categories: Python, Random, Tips

Undocumented Methods in NSIndexPath – Row, Section

May 11th, 2009 Frank No comments

Did you know that the NSIndexPath object apparently has undocumented properties / methods? I’m working on a hobby iPhone OS application that I hope to release, and I’ve been trying to figure out how to use the instance of indexPath provided by: tableView:cellForRowAtIndexPath:

Full method signature (from the UITableViewDataSource protocol):

- (UITableViewCell *)tableView:(UITableView *)tableView
    cellForRowAtIndexPath:(NSIndexPath *)indexPath

I was flipping through my iPhone dev book and was reminded that we have a section method on the instance of indexPath. I just wanted to post this here because it is (at this time) not in Apple’s documentation for NSIndexPath; both methods come from UIKit’s additions to NSIndexPath rather than from the Foundation class itself.

Thus far, there are two that I want to mention: section and row. Most iPhone developers will know about both of these… unless you are new at this, in which case it can be perplexing…

    NSUInteger section = [indexPath section];
    NSUInteger row = [indexPath row];

Great SQL Server Tip

May 8th, 2009 Frank No comments
Categories: SQL Server

Book Review: Beginning iPhone Development

April 24th, 2009 Frank No comments

I’ve been wanting to learn how to develop iPhone applications for a while… basically since the platform hit the streets. I found out the hard way that you need an Intel-based Mac running Mac OS X to do so.

That left me out in the cold because I didn’t want to buy a Mac just to do this, but I did anyway. I bought a MacBook because I wanted a laptop. On a side note, I love the computer. The hardware is much better than Dell’s (Dell has been sliding on the quality of its hardware) and, furthermore, Mac OS X is very, very elegant.

Anyway, I dived in to the iPhone SDK with the hopes of getting my first program out the door quickly. While I was able to get programs together and working, the platform is very, very different. To top it all off, the programming language is Objective-C which is quite different from any other language I’ve used.

Beginning iPhone Development: Exploring the iPhone SDK

I was up to the challenge but I needed help, so I recruited a book. The book is: Beginning iPhone Development: Exploring the iPhone SDK by Dave Mark and Jeff LaMarche.

The book was wonderful; I read it cover to cover in about a month. There are a few typos, but I’m sure they’ll be corrected in a revised release. It is easy to read and easy to follow. I feel I understand everything the book was trying to tell us about. The authors go into each topic enough for you to understand what they are talking about, but they never go so far that they “lose you”.

A wonderful aspect is that there is an active forum to answer questions specifically about your work or learning in the book. Since others have gone through the chapters, other readers can normally help. If all else fails, the authors will help you.

This book won’t make you a master iPhone developer, but it lays a very solid foundation on which to build your skills.

Overall, I highly recommend this book for anyone who wants to learn about developing iPhone or iPod touch applications.