All posts by Gary Berger

Intel + RapidMind = Better utilization of multi-core

It has long been understood that programming for SMP and multi-core architectures is extremely difficult for many reasons. Managing concurrency and serialization are important aspects of multi-threaded application development.

Intel's purchase of RapidMind boosts its toolbox for helping compiler jockeys fully utilize multi-core/many-core architectures in their C++ programs.

RapidMind offers simple constructs that let developers decorate their code through an open, array-based interface, removing the need for the developer to worry about the atomic unit of execution.

RapidMind provides an abstraction layer which allows developers to take advantage of Intel multi-core processors, GPUs and the Cell Broadband Engine.
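
To make the data-parallel idea concrete, here is a minimal sketch of the same style using plain Java parallel streams rather than the RapidMind API (whose exact constructs I won't reproduce here): the developer describes one operation over an array and lets the runtime decide how to spread the work across cores.

import java.util.stream.IntStream;

public class DataParallelSketch {
    public static void main(String[] args) {
        double[] input = new double[1_000_000];
        double[] output = new double[input.length];

        // Describe the per-element computation once; the runtime
        // partitions the index range across the available cores.
        IntStream.range(0, input.length)
                 .parallel()
                 .forEach(i -> output[i] = Math.sqrt(input[i]) * 2.0);

        System.out.println("first result: " + output[0]);
    }
}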

Way to go, Intel. RapidMind is a great addition to help developers really exploit next-generation data center architectures.

What Should Be VMware's Next Move?

I wanted to point out an interesting article posted here on CIO.com.

Here is an excerpt:

“The most glaring omission [in VMware’s portfolio] is [the] need for Java object distributed caching to provide yet another alternative to scalability,” Ovum analyst Tony Baer said in a post to his personal blog on Tuesday. “If you only rely on spinning out more [virtual machines], you get a highly rigid, one-dimensional cloud that will not provide the economies of scale and flexibility that clouds are supposed to provide. So we wouldn’t be surprised if GigaSpaces or Terracotta might be next in VMware’s acquisition plans.”

Now I couldn't be happier that someone besides myself recognizes that in order for services to be uncoupled from the persistence layer you must have a distributed caching system. There are several players in this field, not all created equal but all with value: GigaSpaces, Terracotta, Oracle Coherence (formerly Tangosol) and GemStone.

Distributed caching is nothing new, and most of the large internet companies like Facebook and Twitter are utilizing open source tools like memcached to get a very rudimentary distributed cache.
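
As a rough sketch of what a distributed cache buys you, here is the classic cache-aside pattern in Java. CacheClient and loadUserFromDatabase are hypothetical stand-ins for a memcached-style client and a persistence layer, not any particular vendor's API.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a distributed cache client (memcached, Coherence, etc.)
class CacheClient {
    private final Map<String, Object> store = new ConcurrentHashMap<>();
    Object get(String key)               { return store.get(key); }
    void   set(String key, Object value) { store.put(key, value); }
}

public class CacheAsideExample {
    private final CacheClient cache = new CacheClient();

    public Object getUser(String userId) {
        // 1. Try the cache first
        Object user = cache.get("user:" + userId);
        if (user == null) {
            // 2. On a miss, hit the persistence layer and populate the cache
            user = loadUserFromDatabase(userId);
            cache.set("user:" + userId, user);
        }
        return user;
    }

    // Placeholder for the real database call
    private Object loadUserFromDatabase(String userId) {
        return "user-record-for-" + userId;
    }

    public static void main(String[] args) {
        CacheAsideExample app = new CacheAsideExample();
        System.out.println(app.getUser("42")); // miss: loads and caches
        System.out.println(app.getUser("42")); // hit: served from cache
    }
}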

Gartner analyst Massimo Pezzini is right on with his comment: “I think one of the reasons why VMware is buying SpringSource is to be able to move up the food chain and sell cloud-enabled application infrastructure on top of their virtualization infrastructure,” Pezzini said. “It wouldn’t take much to make it possible to deploy Spring on top of the bare VMware — i.e., with no Linux or Windows in the middle.”

If VMware shifts its focus onto the Java stack, it can be well on its way to building a complete service virtualization platform.

The Java platform has an opportunity to sit on the bare metal and provide a ubiquitous abstraction layer between the infrastructure and the application stack. If we look at Oracle JRockit, IBM Libra and Sun Maxine, there is already much research into a bare-metal JVM. Sun has also been working on Guest VM, a pure-Java OS written entirely in Java, which eliminates Windows and Linux from the guest altogether.

The realization here is that instance scaling (virtual machine proliferation), which requires moving the complete server state from machine to machine, is a difficult and dirty process. If we abstract the underlying operating system as a pure Java runtime, we can migrate our Java applications very simply; in fact, that is the main use case I demonstrated in my multi-part series, which utilizes GigaSpaces as an In-Memory Data Grid.

Part 2: Using Groovy, Grails and Gigaspaces “3G”

Part 2: Utilize a dynamic language, one that really anyone can learn.

I chose to use Groovy and Grails for this project. Why?

Because of Groovy's natural support for the Java language, anyone with a background in Java can be productive. Groovy is a dynamic language that supports first-class functions, closures and similar features, which saves developers a lot of time. It is not statically typed, so you don't have to declare the "storage" type before you use a variable; you can recast a variable depending on the problem you are working on, making the language very fluid. Groovy also supports nearly all native Java syntax.

Groovy benefits (http://groovy.codehaus.org/)

  • Is an agile and dynamic language for the Java Virtual Machine
  • Builds upon the strengths of Java but has additional power features inspired by languages like Python, Ruby and Smalltalk
  • Makes modern programming features available to Java developers with almost-zero learning curve
  • Supports Domain-Specific Languages and other compact syntax so your code becomes easy to read and maintain
  • Makes writing shell and build scripts easy with its powerful processing primitives, OO abilities and an Ant DSL
  • Increases developer productivity by reducing scaffolding code when developing web, GUI, database or console applications
  • Simplifies testing by supporting unit testing and mocking out-of-the-box
  • Seamlessly integrates with all existing Java objects and libraries
  • Compiles straight to Java bytecode so you can use it anywhere you can use Java

Grails (http://www.springsource.com/products/grails) is an advanced and innovative open source web application platform that delivers new levels of developer productivity by applying principles like Convention over Configuration. Grails helps development teams embrace agile methodologies, deliver quality applications in reduced amounts of time, and focus on what really matters: creating high quality, easy to use applications that delight users.

Grails is built around Spring MVC. Together, Groovy and Grails behave much like another dynamic language and web framework pairing: Ruby on Rails.
Continue reading Part 2: Using Groovy, Grails and Gigaspaces “3G”

Gigaspaces powered Service Virtualization and the Cloud Part 1:

So I promised my pal Shay Hassidim over at Gigaspaces that when I had time I would post the use case I demonstrated to some Cisco folks on the power of "Service Virtualization".

To start off, "Service Virtualization" is nothing new. It is merely another abstraction level on top of the ones that have existed in many forms throughout the decades of modern computing. Look deep down in the edges of the Linux C library and you find the low-level syscall interface, which provides practically the lowest level of interaction between a program and the operating system and is the foundation on which new services are built.

So why should I care about this? Well, as is written in the article The Free Lunch is Over: A Fundamental Turn towards Concurrency in Software, single-threaded applications will gain little improvement over the next decade as clock rates stall to reduce power. These applications may eventually get slower as we pack hundreds, even thousands, of cores on a socket. Scale-out and proper concurrency methods allow us to break the problem into many smaller chunks and put thousands of little workers on the problem.
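
As a minimal illustration of the "many small workers" idea (my own sketch, not tied to any particular product), here is a problem split into per-core chunks with a Java thread pool:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class ScaleOutSketch {
    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        long[] data = new long[10_000_000];
        int chunk = data.length / cores;

        // Break the problem into one chunk per worker
        List<Future<Long>> partials = new ArrayList<>();
        for (int w = 0; w < cores; w++) {
            final int start = w * chunk;
            final int end = (w == cores - 1) ? data.length : start + chunk;
            partials.add(pool.submit(() -> {
                long sum = 0;
                for (int i = start; i < end; i++) sum += data[i];
                return sum;
            }));
        }

        // Combine the partial results
        long total = 0;
        for (Future<Long> f : partials) total += f.get();
        System.out.println("total = " + total);
        pool.shutdown();
    }
}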

With the thermodynamic barriers in chip design driving us wide instead of tall (horizontal instead of vertical, scale-out instead of scale-up... okay, enough analogies), new architectural patterns are emerging, including space-based architectures and event-driven architectures, to address the obstacles developers face. A lot of this work is rooted in a more simplistic look at the application stack, built on top of the well-understood JVM or CLR.
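
Here is a bare-bones sketch of the space-based master/worker shape, using an in-JVM queue as a stand-in for a real shared space such as a GigaSpaces space; this shows the pattern only, not a product API.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SpaceBasedSketch {
    // Stand-in for a shared "space": masters write tasks in, workers take them out
    static final BlockingQueue<String> space = new LinkedBlockingQueue<>();

    public static void main(String[] args) throws InterruptedException {
        // Worker: blocks until a task appears in the space, processes it, repeats
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String task = space.take();   // "take" removes the entry from the space
                    System.out.println("processed " + task);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();

        // Master: writes work into the space without knowing which worker picks it up
        for (int i = 0; i < 5; i++) {
            space.offer("task-" + i);
        }
        Thread.sleep(500); // give the worker a moment before the demo exits
    }
}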

Functional languages like Erlang and Clojure are seeing a resurgence of interest, as are newer languages like Ruby, Groovy and Scala.

These languages offer feature-rich yet pithy syntax, with no need for semicolons or parentheses in many cases. This reduces the noise and allows the code to read more like plain English.

This more "readable" code allows developers to communicate with domain experts more easily, speeding up the dev/test cycle to get "good enough" code out the door for consumption.

In essence, the business interface gathers around a new DSL (Domain-Specific Language) written in the semantic language of the business, and relationships across the domain are easily described through method libraries.
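
For example, a tiny internal DSL in Java can read close to the business vocabulary; Order, product, quantity and shipTo are hypothetical names used only for illustration.

public class OrderDslExample {

    // A minimal fluent builder that reads like the business vocabulary
    static class Order {
        private String product;
        private int quantity;
        private String destination;

        static Order order()         { return new Order(); }
        Order product(String name)   { this.product = name; return this; }
        Order quantity(int qty)      { this.quantity = qty; return this; }
        Order shipTo(String city)    { this.destination = city; return this; }

        @Override public String toString() {
            return quantity + " x " + product + " -> " + destination;
        }
    }

    public static void main(String[] args) {
        // Reads almost like a sentence a domain expert could review
        Order o = Order.order().product("widget").quantity(100).shipTo("New York");
        System.out.println(o);
    }
}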

So what was the demo about?

This entire demo was used to show how you could build a working JEE-like application, instantiate it over a grid provided by an L3 encrypted overlay, and allow the application to migrate around the cloud, in and out at will. My thesis is that you can create a movable matrix of hosts around a graph of cloud resources regardless of the source. We don't take into account any global load balancing scenarios; that's a problem for another day.

So how do we do that…

My demonstration had several parts:

Part 1: Figure out how to configure systems in a stateless Amazon EC2 world…

Part 2: Utilize a dynamic language, one whose semantics really anyone can learn and be productive with as a programmer, given the correct style guide and a well-documented API.

Part 3: Connect multiple hosts across autonomous cloud entities utilizing an SSL VPN overlay network. This provides the illusion of being on the same L2 network. Oh, and it has to support multicast forwarding so I can have dynamic service discovery (a minimal multicast discovery sketch appears after this list).

Part 4: Establish the VPN and key assignment using CohesiveFT VPNCubed and OpenVPN.

Part 5: Batch the images on Elastic Server, deploying some to Seattle, some to Dublin, and one to my own machine in NY.

Part 6: A customer writes into the application and the data is instantly synchronized across the world.
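
As promised under Part 3, here is a minimal sketch of why multicast forwarding matters for dynamic service discovery: a node announces itself to a multicast group and any peer that appears to be on the same L2 segment (courtesy of the overlay) can hear it. The group address, port and message format are arbitrary values chosen for illustration.

import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

public class MulticastDiscoverySketch {
    public static void main(String[] args) throws Exception {
        InetAddress group = InetAddress.getByName("239.1.2.3"); // arbitrary multicast group
        int port = 4446;

        try (MulticastSocket socket = new MulticastSocket(port)) {
            socket.joinGroup(group);

            // Announce this node to anyone listening on the overlay network
            byte[] hello = "service-up my-app-node-1".getBytes("UTF-8");
            socket.send(new DatagramPacket(hello, hello.length, group, port));

            // Listen for announcements from other nodes (including our own echo)
            byte[] buf = new byte[256];
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            socket.receive(packet);
            System.out.println("heard: " + new String(packet.getData(), 0, packet.getLength(), "UTF-8"));
        }
    }
}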

Continue reading Gigaspaces powered Service Virtualization and the Cloud Part 1:

Powershell and Microsoft Exchange

I thought I would post some work I did a few years ago showing the powerful PowerShell scripting language.

getMdb attempts to scatter requests for mailbox assignment by using a random number to choose among servers within a region; in this example there are just two regions, East/North and West/South.

It uses a COM hook into the Exchange Server to read server attributes and storage groups.

#  Name.......: getMdb()
#  Description: Used to select a random mailbox store based on region and archive attributes
#  Inputs.....: $region  (North, West, South, East) Used to select destination mailbox server
#               $archive (True, False) Used to select proper mailbox store based on journaling flag

function getMdb(){
    param($region, $archive)

    # Local variables
    $mailstore    = @()
    $archivestore = @()

    $Random = New-Object Random

    # Exchange server selection process based on region
    if($region -like "EAST" -or $region -like "NORTH"){
        $mservers = @("server list A...")
    }
    elseif($region -like "SOUTH" -or $region -like "WEST"){
        $mservers = @("server list B...")
    }

    foreach ($server in $mservers){

        # Open a CDOEXM COM connection to the Exchange server and enumerate its storage groups
        $excObj = New-Object -ComObject CDOEXM.ExchangeServer
        $sgObj  = New-Object -ComObject CDOEXM.StorageGroup
        $excObj.DataSource.Open([string]$server)
        $sgs = $excObj.StorageGroups

        foreach ($objItem in $sgs) {
            $sgObj.DataSource.Open($objItem)
            $dbs = $sgObj.MailboxStoreDbs

            # Split mailbox stores into archive (journaling) stores and regular stores
            foreach ($mstore in $dbs){
                if($mstore -like "*Zantaz*"){
                    $archivestore += $mstore
                }
                else {
                    $mailstore += $mstore
                }
            } # EndForEach mstore
        } # EndForEach storage group
    } # EndForEach server

    "LOG: getMdb - Return mailstores $mailstore" >> $logfile

    # Pick a random store from the appropriate list
    if($archive -eq $True){
        $index = $Random.Next(0, $archivestore.Count)
        $global:mdb = [string] $archivestore[$index]
    }
    else{
        $index = $Random.Next(0, $mailstore.Count)
        $global:mdb = [string] $mailstore[$index]
    }
    write-host "Return MDB $($mdb)"

    "LOG: getMdb - Return MDB $mdb" >> $logfile
}
#// -end getMdb

What does IT know about the stability and viability of a software provider?

After reading the response from BusinessWeek writer Rachael King on Dennis Byron's blog post "Is BusinessWeek out to Get the Enterprise Software Business?" I ask myself how close to the truth the following comment is:

“IT departments need to think about the stability and viability of the software provider”

How does one assess "viability"? Is it the software provider's balance sheet, number of developers, or R&D budget? Is it the number of bugs, patches and updates in their software packages, or how well they respond to problems in a timely manner?

Now there is plenty of enterprise software out there which provides the backbone of major corporations, institutions and government. Microsoft Exchange and Active Directory have been pivotal in providing a relatively stable platform for services like email and authentication to the business. But I would argue this is not where businesses make their money. Let's be honest: businesses must learn to utilize their product and customer knowledge, along with their financial strength and appetite for risk, in order to differentiate. There are untold secrets deep within corporate data repositories that need to be unlocked, normalized and mined for opportunities. Business intelligence is a giant and sticky ball of twine which needs to be untangled. This is where software development and IT work together to deliver exceptional value.

The truth of the matter is that software development is moving faster than ever, and businesses that don't take hold of their application portfolios are doomed to repeat the missteps of the past. Does anyone remember the protocol wars (IPX, IP, SNA), Y2K, or the myriad of worms, viruses and malware that have infected versions of Windows for years? How much of administrators' time is wasted waiting for a reply on a bug from a large enterprise software provider?

If you look at modern software practices in open source, you will find a scary process by which thousands of individuals contribute towards building something that couldn't be sustained even by the largest software development houses like IBM and Microsoft alone. Code enhancements, features and regression testing are all done by a community of individuals (some sponsored, some not, some anonymous) who make a worthwhile effort to build sustainability into a very dynamic system.

In fact, the Linux 2.6 kernel changes so often that there is an ever-evolving process for testing new ways of optimizing, tuning and delivering code. Functional weaknesses in the process are flushed out quickly by the community and fixed on the fly (a sort of weakly bonded neural network). This is no typical software development project; with millions of lines of code and counting, the Linux kernel is an unbelievably effective software development effort. See here:

"With the 2.6.x series, the Linux kernel has moved to a relatively strict, time-based release model. At the 2005 Kernel Developer Summit in Ottawa, Canada, it was decided that kernel releases would happen every 2-3 months, with each release being a "major" release in that it includes new features and internal API changes."

[Figure: Linux_Release]

Open source gives everyone the opportunity to peek inside, assess the viability of the code on its merits (not marketecture) and decide which parts are useful for building competitive value. These code pieces are then layered together to provide a domain-specific service applicable to the business.

I just want to take a moment to reflect on a critical piece of software development history.

Back in my early days working with Oracle, there were no client drivers for DBMS access like there are today with ODBC, JDBC, etc. In order to execute a query against the Oracle database you had to use something called the Oracle Pro*C precompiler, which took your ANSI SQL statements and turned them into a bunch of C language constructs that then had to be compiled into an executable.

Luckily those days are gone. With the adoption of VMMs, paravirtualization and robust runtimes like Java, the developer can spend more time being creative rather than doing the janitorial work of conforming to the underlying infrastructure.
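
For contrast with the precompiler days, here is roughly what the same round trip looks like today through JDBC; the connection URL, credentials and table are made-up examples.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class JdbcExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; no precompile step, just a driver on the classpath
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";

        try (Connection conn = DriverManager.getConnection(url, "scott", "tiger");
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT ename FROM emp WHERE deptno = ?")) {

            stmt.setInt(1, 10);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("ename"));
                }
            }
        }
    }
}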

More and more intelligent layers are being built into the architecture stack, providing everything from In-Memory Data Grids and clustered file systems to new execution patterns like Map/Reduce. In cloud taxonomy jargon, this layer is called Platform as a Service. These services abstract the complex nature of resource management away from the SaaS architect, allowing them to deliver compelling value-added services.
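
As a toy example of the Map/Reduce pattern mentioned above, in-memory and single-process purely to show the shape of the map and reduce steps:

import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

public class MapReduceSketch {
    public static void main(String[] args) {
        String[] lines = {
            "the quick brown fox",
            "the lazy dog",
            "the quick dog"
        };

        // Map: split each line into words; Reduce: sum the counts per word
        Map<String, Long> counts = Arrays.stream(lines)
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                .collect(Collectors.groupingBy(word -> word, Collectors.counting()));

        System.out.println(counts);
    }
}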

In summary, yes, IT needs to think about the stability and viability of the software provider, but IT also needs to take responsibility for its own development destiny. We need to reward creativity and responsibility and attract more students to computer science and programming technologies. The problems we see in software development won't go away; in fact, things are going to get harder before they get better. So hack on... it will be a wild ride.

Next we will discuss these layers in more depth, using the taxonomy of cloud computing to describe Platform as a Service.

-g

Welcome

So finally after many years I have decided to put my thoughts down in a weblog. Hopefully someone besides me will find these ramblings beneficial.

For the most part this is a place to track the work I do related to bringing cloud computing to the enterprise.