Simple Features Thoughts about simple features in software development

28 Jan 2012

How did you fail now, Mr. Dumbass?


I have seen a lot about Node.js in the last 12 months. In the beginning I followed the hype on blogs and Twitter. I wanted to know what was so great about a JavaScript-based socket server. At first I thought it was something like a WebSockets framework. Then I understood it was something else, built to live on the server side, and I was even more amazed. I thought there was something I wasn't aware of, as if I was missing the great part of having JavaScript on the server side. I was ready to start a new project involving Node.js so I could see its potential, so I studied it some more. That was the moment I realized what was inside the box of amazement: nothing. At least nothing new.

The real reason for the hype was that it is programmed in a language that is accessible to almost every developer out there. Accessible not because it's easy to learn but because it's everywhere. Even plenty of web designers know JavaScript decently enough.

I'm not saying it doesn't work, or that it doesn't work correctly (in the end I haven't even used it).

The hype was driven by a lot of people who had never worked with sockets and their various flavours (socket event handling included); people who have always stayed far from multi-threading (and its debugging issues); people who have always programmed with a Software Architect by their side and plenty of frameworks at hand. People who have never had to profile and debug, and some who have never had to improve memory consumption and throughput when working with byte-level protocols. And it's not that I think low-level programming is more interesting or more difficult, or only for computer geniuses. It's because some people do that kind of programming, others are great web designers, others are great architects, and others write a lot of useful boilerplate.

Now, what happens when someone with a decent knowledge of JavaScript, but no experience at all in systems and network management, data transfer rates, network security, etc., starts opening network sockets all over the infrastructure? Format errors, security breaches, stack overflow issues, and so on... because it takes a lot of time for server systems to mature. The same goes for server developers. You will not make the next Apache just because you know JavaScript and a bit about sockets; you will make the next Apache when you have gone a long way with development, network understanding, systems administration and many other things, and by then you will know a language better suited for low-level programming.

And the real problem is with people who do stuff just because they can, not because they know how. We have a lot of tools that deal with every "I don't know how" and turn it into an "I have a tool for that!". "I know a wizard", "I've read a tutorial", "My IDE makes it very simple" and many others stand for the same kind of technology abuse.

Tutorials, wizards and tools are for shortening learning and development times, not for turning any developer into an instant expert on any subject. We all must go a long way before really knowing something, and sometimes we think something will work great just because the tutorial example works perfectly.

I used to work in a company where my boss believed that any time he asked "What was the failure?", there would be someone among us who knew every line of code in our systems and would give him an answer sooner or later. To make that happen we had to implement as much as possible ourselves. It took us plenty of development time, but there were no large delays in problem solving. Some of our advantages were that we had a controlled network environment, and that we grew as a development team together with the infrastructure. We were dumber when the network was smaller. We would always have an answer for that question.

Whenever you have a problem and you don't know the whole picture, and you don't know how to fix it, the question will change from "What was the failure?" to "How did you fail now, Mr. Dumbass?".


27 Nov 2010

Java Dates – Is long a good idea for handling dates?

The answer: yes, most of the time, but...

12 Nov 2009

Text, time, and money

A couple of weeks ago I was about to deliver a new communications system for AVL (Automatic Vehicle Location). We realized that on that weekend Mexico would end its daylight saving time: the UTC-5 offset would change to the regular UTC-6 offset (CST). In our company we had established a standard for applying time changes coherently across the whole system according to the national specifications. However, the system was now reporting a difference of -4 hours.

What the hell is going on? How did it go from a -5 hour difference that should return to the regular -6 to a totally abnormal -4 and, to make it more complicated, 5 hours before the time change took place?

This is the kind of requirement that can turn a programmer into a mess in no time... It was at that moment that my friend Carlo and I remembered one of the most important things in computing: after 60 years of innovation, new technologies, astonishing achievements in science, moon landings, etc., there has been no definitive solution for three main subjects:

  1. Currency,
  2. text,
  3. and time

Currency is a data type in itself. Currency cannot be represented by a float or by a double. The problem with currency is not how to do simple arithmetic with a Currency class (whose verbosity can make us want to change careers), but all the complexity of establishing the value of money when the system is not prepared for currency changes... that is a death sentence for a system.

The problem with text is about character sets (charsets). Today I was watching a webcast where one of the fabulous features of text processing in Silverlight 4 was the ability to mix charsets in the same document. In a single paragraph we were seeing UTF-8 mixed with other Unicode text. Finally there was a text document with a feature previously seen only in ink and paper, and it only took sixty years of technology and research.

With the use of highly verbose formats (I'm thinking of you, XML), storing and distributing information is not an easy task when the time comes to choose formats. The first time I used XML I simply copied the header I found most often in the examples, and it always said "UTF-8"; using UTF-8 in Spanish is like cutting the dictionary in half. After a few attempts I started using ISO-8859-1, with all the capabilities it gave me to use the complete Spanish dictionary but, was it the best design decision?

Time is a complete science. Time zones are HORRIBLE: differences of hours, half hours and quarter hours (see Nepal and the Chatham Islands) between time zones are simply madness. And there is a lot to answer regarding time:

  1. At what time did X happen? User time? Server time? And what if my application is distributed?
  2. At what time will X happen?
  3. If the time zone changes (mobile devices), will my device notice?
  4. If I'm using Linux and my application moves to Windows, will the time still be coherent?
  5. And the questions can go on ad infinitum...

Calendars... has anyone worked with calendars without feeling the need to be buried alive? There is no correlation between all the calendars. For example, consider two facts about the Jewish calendar: the year from 2009 to 2010 corresponds to year 5770 of its era, and the year has 354 days (or 353, or 355) divided into 12 lunar months, filling in the remaining days with an extra month periodically.

The biggest problem with these three subjects: most of the questions arise only once they have surfaced in the form of a bug.

A regular application to be used within a company must be aware of these problems. Usually, a developer will just use the first approach the language or the IDE offers when a new product has to be delivered. Solving these issues with the predefined options will become a decision you inherit in your following steps in the company, and the decision you will ALWAYS regret.

These three subjects are the reason this blog exists. By reviewing them we can see that there are many basic requirements in programming that affect the behavior of a system and that are not well addressed for beginners, and where advanced programmers give only general advice but no direct answers. These are not necessarily the only subjects, but we will be working on them frequently.

In vague terms, this is what happened with the four-hour difference: I did not consider the behavior of the language when implementing the company's standard. The details will be analyzed and corrected next week.

12 Nov 2009

Time, text and money

A couple of weeks ago I was about to deliver a new AVL (Automatic Vehicle Location) communications system. Then we realized that on that weekend Mexico would end its daylight saving time: the UTC-5 offset would go back to the regular UTC-6 offset (CST). In our company we had established a standard for applying the time change coherently across the whole system according to the national settings. But now our system was reporting a -4 difference.

WTF?... How the hell did it go from a -5 that should return to a regular -6 to an out-of-the-question -4 and, to make it more complex... 5 hours before the time change occurs?

Those are the kind of features that can turn a developer into a complete mess in no time... That was the moment my friend Carlos and I remembered one of the most important things in computing: after 60 years of innovation, new technologies, outstanding achievements in science, moon landings, etc., there hasn't been a definitive approach for three primary subjects:

  1. Currency,
  2. text,
  3. and time

Currency is a data type in itself. Currency can be represented neither by a float nor by a double type. The problem with currency is not how we do simple arithmetic with the Currency class (whose verbosity alone would make us want a career change), but all of the complexity of stating amounts of money when the system is not prepared for exchanges... it's a definitive system killer.
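To make the float/double point concrete, here is a minimal Java sketch. It assumes `BigDecimal` as the replacement for binary floating point, which is a common choice but not something this post prescribes. The class `MoneyDemo` and its methods are made up for illustration only.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Currency;

// Hypothetical demo class; names are illustrative, not from the post.
class MoneyDemo {

    // Adds ten cents ten times using double. 0.10 has no exact binary
    // representation, so the rounding error accumulates in the total.
    static double sumWithDouble() {
        double total = 0.0;
        for (int i = 0; i < 10; i++) {
            total += 0.10;
        }
        return total;
    }

    // Same sum with BigDecimal built from a decimal string, which is exact.
    static BigDecimal sumWithBigDecimal() {
        BigDecimal total = BigDecimal.ZERO;
        BigDecimal tenCents = new BigDecimal("0.10");
        for (int i = 0; i < 10; i++) {
            total = total.add(tenCents);
        }
        return total;
    }

    // Rounds an amount to the number of fraction digits the currency defines
    // (2 for MXN, 0 for JPY). HALF_EVEN is an assumed rounding policy here.
    static BigDecimal roundFor(BigDecimal amount, Currency currency) {
        return amount.setScale(currency.getDefaultFractionDigits(), RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        System.out.println("double sum equals 1.0? " + (sumWithDouble() == 1.0));
        System.out.println("BigDecimal sum equals 1? "
                + (sumWithBigDecimal().compareTo(BigDecimal.ONE) == 0));
        BigDecimal price = new BigDecimal("19.999");
        System.out.println(roundFor(price, Currency.getInstance("MXN")));
        System.out.println(roundFor(price, Currency.getInstance("JPY")));
    }
}
```

Note that this only covers the arithmetic. It does nothing for the harder problem above, a system that was never designed to handle more than one currency.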

The problem with text is all about charsets. Today I was watching a webcast where one of the oh-my-god! features in text processing for Silverlight 4 was the capability of mixing charsets in the same document. In one paragraph we were watching UTF-8 mixed with some other Unicode text. Finally, a text document had a feature previously only possible using ink on paper, and it only took sixty years of technology and research.

With the use of highly verbose formats (I'm thinking of you, XML) the storage and deployment of information is not an easy task when the time comes to choose the formats. The first time I used XML I just copy/pasted the header I most commonly found in the examples, and it always said "UTF-8"; using UTF-8 with the Spanish language was just slicing the dictionary in half. After some attempts I started using ISO-8859-1, with all of the capabilities it gave me to use the whole of the Spanish dictionary but, did I make the right design decision?
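The post does not say what exactly went wrong with the "UTF-8" header. One plausible cause, and this is only an assumption, is a mismatch between the encoding declared in the header and the bytes actually written. The Java sketch below shows that kind of mismatch, and also one thing ISO-8859-1 cannot do. `CharsetDemo` and its methods are hypothetical names used only for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative, not from the post.
class CharsetDemo {

    // Encodes text with one charset and decodes the bytes with another,
    // which is what happens when a declared encoding does not match the data.
    static String roundTrip(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    // Number of bytes the text occupies in the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "año" written with an escape so the example does not depend on
        // the encoding of this source file.
        String word = "a\u00F1o";

        // Matching encodings preserve the text in both charsets.
        System.out.println(roundTrip(word, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
        System.out.println(roundTrip(word, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));

        // Mismatch: UTF-8 bytes read as ISO-8859-1 turn one letter into two.
        System.out.println(roundTrip(word, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));

        // ISO-8859-1 has no euro sign, so it is replaced when encoding.
        System.out.println(roundTrip("\u20AC", StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));

        System.out.println(byteLength(word, StandardCharsets.UTF_8));
        System.out.println(byteLength(word, StandardCharsets.ISO_8859_1));
    }
}
```

If the mismatch assumption holds, both charsets can carry Spanish text as long as the writer and the reader agree on which one is in use. The design question then becomes which characters outside Spanish the system will ever need.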

Time is a full science. Timezones are a real P.I.T.A.: hour, half-hour and even weird quarter-hour differences (see Nepal and the Chatham Islands) between timezones are just insane. And there are many questions to answer about time:

  1. At what time did X happen? User local time? Server local time? And what if my application is distributed?
  2. At what time will X happen?
  3. If the timezone changes (mobile devices), is my device aware?
  4. If I'm using Linux and then my application goes to Windows, will the time still be coherent?
  5. And the questions may go on and on...
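The offsets mentioned above can be inspected with a short Java sketch. It assumes the `java.time` API (Java 8 and later, so newer than this post) and the IANA zone identifiers `America/Mexico_City`, `Asia/Kathmandu` and `Pacific/Chatham`. It does not try to reproduce the -4 bug. It only shows one possible answer to "at what time did X happen?": store a single UTC instant and derive local views from it. `TimeZoneDemo` is a hypothetical name.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical demo class; names are illustrative, not from the post.
class TimeZoneDemo {

    // UTC offset in effect in the given zone at the given instant,
    // according to the time zone rules shipped with the JDK.
    static ZoneOffset offsetAt(String zoneId, Instant instant) {
        return ZoneId.of(zoneId).getRules().getOffset(instant);
    }

    // The same instant seen as local date and time in the given zone.
    static ZonedDateTime localView(Instant instant, String zoneId) {
        return instant.atZone(ZoneId.of(zoneId));
    }

    public static void main(String[] args) {
        // Instants on both sides of the end of Mexican daylight saving time
        // in 2009 (last Sunday of October).
        Instant before = Instant.parse("2009-10-24T18:00:00Z");
        Instant after = Instant.parse("2009-10-26T18:00:00Z");
        System.out.println(offsetAt("America/Mexico_City", before));
        System.out.println(offsetAt("America/Mexico_City", after));

        // Quarter-hour offsets: Nepal and the Chatham Islands.
        Instant july = Instant.parse("2009-07-01T00:00:00Z");
        System.out.println(offsetAt("Asia/Kathmandu", july));
        System.out.println(offsetAt("Pacific/Chatham", july));

        // One stored instant, two local views of the same event.
        System.out.println(localView(before, "America/Mexico_City"));
        System.out.println(localView(before, "Asia/Kathmandu"));
    }
}
```

Storing the instant in UTC and converting only for display is one design choice among several. It answers the first question in the list, but not the one about future events, whose local time depends on rules that may change before they happen.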

And calendars... has anyone worked with calendars without feeling the need to be buried alive? There is no correlation between all of the calendars. For example, let's see two facts about the Jewish calendar: the years 2009 - 2010 correspond to its year 5770, and the year has 354 days (or 353 or 355) divided into 12 lunar months, filling the remaining days with a leap month periodically.

The major problem with these three things we are talking about: most of the questions arise only once the issue has surfaced as a bug.

A regular company-wide internal application has to be aware of these problems. Typically, the developer will just use the first approach the language and the IDE offer when a new product has to be delivered. But solving it in a simple out-of-the-box manner will become a decision you will inherit in all your following steps (and others') in the company, and the decision you will ALWAYS regret.

These three subjects are the reason this blog exists. By reviewing them we can realize that there are many simple features all over programming that affect the overall behaviour of a system, that are not well addressed for beginners, and for which advanced programmers sometimes give only general advice but no straight answers. Those are not necessarily the only subjects, but we will come back to them frequently.

In rough terms, that was what happened with our 4 hour difference: I didn't take the language's behavior into account when implementing the company's standard. The details will be analyzed and fixed next week.