I still remember
Posted by Holger Schauer in Emacs
Bertrand Mathieu's blog article on cleaning up trailing whitespace comes in handy for me, as I've frequently run into problems with the Python test coverage tool stumbling over trailing whitespace. It also reminded me of an Emacs snippet I recently installed to detect a mix of spaces and tabs in my Python buffers -- I do set (setq indent-tabs-mode nil) in my .emacs for python-mode; however, I still occasionally manage to insert some tabs into my source buffers. So I came up with the following snippet, which validates that a buffer in python-mode doesn't contain any tabs. It's hooked up to the very general write-file-hook, since there is no Python-specific hook for saving buffers. If the buffer does contain a tab, it will leave point (the place where your cursor will be) at the tab it found.
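The same check is easy to picture outside Emacs as well. The following Python sketch is not the Emacs Lisp hook described in this post; it only illustrates the idea, assuming the buffer contents are available as a string. The function name `find_first_tab` is made up for this example. It reports the line and column of the first tab, which roughly corresponds to the position where point would be left.

```python
def find_first_tab(source):
    """Return (line, column) of the first tab in source, or None.

    Lines are counted from 1 and columns from 0, similar to what an
    editor typically shows in its mode line.
    """
    for lineno, line in enumerate(source.splitlines(), start=1):
        column = line.find("\t")
        if column != -1:
            return (lineno, column)
    return None


if __name__ == "__main__":
    clean = "def f():\n    return 1\n"
    mixed = "def f():\n\treturn 1\n"
    print(find_first_tab(clean))  # no tab anywhere
    print(find_first_tab(mixed))  # tab at the start of the second line
```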
Enjoy.
Update: I should clarify that my previous reference to (set-indent-tabs-mode nil) is to a local function that simply wraps setting the variable indent-tabs-mode directly. I also use it for a local toggle function, like so:
ObTitle: Bloc Party, "Weekend in the city"
I guess I'll hang out my tears to dry
Posted by Holger Schauer in Programming
There seems to be a roaring discussion about a talk by Robert "Uncle Bob" Martin at this year's RailsConf, which mainly seems to be concerned with the question (raised by a quote attributed to Ward Cunningham) of whether languages such as Smalltalk make it too easy to create a mess. Languages such as C++ would at least penalize the mess with longer compile times.
As Giles points out, that's an old argument, which I too have heard too often in the past about dynamic languages, regardless of whether we're talking Smalltalk, Ruby, Lisp, Python or Perl. And the antidote was always the current idiot language of the day, i.e. Visual Basic in its heyday, Java or C# (for those who considered C or C++ to be too dangerous). Please note that I'm not at all saying that users of those languages are idiots, but those languages are specifically claimed to be usable by idiots, too. This approach to languages or to their classification does a great job of misrepresenting programmers, regardless of their language of choice. It makes the bold claim that there are programmers who are idiots and that it's possible to design a language in which even those idiots can do their jobs. And the exaggeration is that it's only good business sense to enforce the use of those languages for all development projects, because, as we all know, there are more idiots than intelligent programmers out there, right?
The only problem with this claim is that human stupidity is probably the only thing larger than the universe: if you make something idiot-proof, nature will quickly make a better idiot. To put it a little less cynically, to err is human. And there is value in people making mistakes: as Tom DeMarco claims in his book "Slack", for instance, it's a particularly good way to learn.
Bob Martin tries to sell excessive testing (which isn't particularly surprising, given his background) as one safeguard, to which James Robertson responds that this is not always necessary: "Yes, tests are useful. But, the debugger is not something to be feared. Rather, it's a great tool to be used in order to have the computer do all the memory work for you. I can get a lot more done by working with decent tools like the Smalltalk debugger than I can by assuming the doc is good and writing tests that just help me a whole lot less than you might think."
I can relate to this quote, being a fan of the interactive development that languages like Lisp or Smalltalk made popular. But I think James is missing the point: tests are not only helpful during development. Sure, using an interactive approach helps to get the code running, but does it help to come up with a clean design? Looking back at some code, I can assure you it doesn't; quite to the contrary, it's a way to get a working solution.
But the real point summing up the discussion so far is the old Frederick Brooks quote: there is no silver bullet. It's good to be careful and to use the tools at hand to help you avoid the big mistakes. As Paul Graham puts it for OOP: "For programs that would have ended up as spaghetti anyway, the object-oriented model is good: they will at least be structured spaghetti." But in the end you have to learn how to avoid the mess in the first place, and no language and no tool can do that for you.
ObTitle: Dexter Gordon, "Ballads"
Ever fallen in love
Posted by Holger Schauer in Computer, Music
Through this InfoWorld article on the greatest cults in IT I've discovered a new band: Press play on tape. The music isn't exactly what I listen to daily, but I have to admit I probably know nearly all the songs they play: cover versions of old Commodore C64 tunes. I also have to admit that I'm still not sure what to make of this, but the cover version of "Crazy comets" is quite cool.
Einmal Hongkong und zurück (Once to Hong Kong and back)
Posted by Holger Schauer in Freiburg, German
Yesterday evening Rainald Grebe performed at the sold-out Vorderhaus of the Fabrik as part of the so-called Geflügeltage. We owe it to a friend that we still managed to get tickets (thanks again, Martin). I had expected quite a lot from the show, but it was a truly fantastic, very funny evening: I haven't experienced that much wordplay in a long time [*]. That he sustains the energy with which he delivers his texts, with and without piano, for more than two and a half hours surely also explains why Rainald Grebe looks a bit like a beanpole. A very cool evening, and a warm recommendation to everybody to take a closer look at, and listen to, this gentleman. BTW: the Geflügeltage, with more good acts, run until May 9. And in July, Hagen Rether is coming to the theater (also organized by the Fabrik).
[*] That's not entirely true: Max Goldt, who recently appeared at the E-Werk, is not to be sneezed at when it comes to wordplay either. But a reading doesn't have as much energy as a piano concert, which offers more opportunities to improvise and to play, and Grebe knows how to use them. The odd grimace and other antics underline the whole thing.
Love is a losing game
Posted by Holger Schauer in Linux
Some time ago I updated my trusty old workstation from Debian Etch to Lenny. As always, there were some minor glitches, though not all of them were necessarily due to Lenny itself. Here's a rough list.
One really annoying issue was that my trusty old Matrox G400 suddenly became unbelievably slow under X11. I could nearly watch the pixels going by. I could tell from the X server log that some things had changed (like Xrandr now running) but I couldn't tell what was responsible for the problem. Explicitly setting the option "NoAccel" to "false" and "NoHal" to "true" (I have a single screen setup) finally settled the problem. I also tried "UseFBDev" "true", but this only works for me with a Depth of 24, which in turn screws up my screen whenever I switch to the console (for instance on hibernate), so I'm back to not using the framebuffer device (and a depth of 16).
The Postgres update didn't work for me. I guess that this was mainly due to a lack of disk space. It took me some time to figure out that no default cluster had been created, which was the reason that the manual restore of the DB backup I had made failed.
I took the chance and finally updated to Iceweasel (firefox) 3. Some of my extensions were lost this way, but I managed to replace or update nearly all of them. Only Hit-a-Hint required a manual intervention directly in the install.rdf of the extension.
One typical annoyance of updates is that user settings are often not updated. This time I experienced that mainly with FBpanel, in which all icon settings were broken.
Of course, the update broke some of my proprietary programs because of library problems. Korrektor, BMM, all are history now.
As are some trusty old programs I still used: xmms, for instance, is gone, too. I used to stick with xmms mainly because of the Windowmaker mini app wmusic, which allowed control of xmms via the dock, but as I've been using Rhythmbox for quite a while now, which has a NETWM compliant panel control, I could finally let it go.
ObTitle: Amy Winehouse, "Back to black"
Busy doing nothing
Posted by Holger Schauer in Linux
I've grown tired of writing Perl snippets just for simple shell tasks [*]. One such typical task is summing up the occurrences of a particular pattern in a set of files to get a total number of matches. Of course, this is trivial, but it's so trivial that I tend to forget how to use 'expr'.
As a one-liner for copy&paste:
[*] The following quote is due to Kristian Köhntopp: "Use perl. It's necessary to know shell programming but not to use it."
ObTitle: Love is all, "Nine times that same song"
I write sins not tragedies
Posted by Holger Schauer in Emacs
One of the nicer Mozilla extensions I use is "It's all text". It allows you to call an external editor to edit text fields, which is a really nice thing if you're editing longer wiki pages, for instance.
Not surprisingly, my external editor of choice is XEmacs. But my XEmacs is heavily configured and loads a lot of stuff. Even worse, I usually run a beta version with debugging turned on, which makes it extra slow, so the time gap between clicking on that little "edit" button below the text field and the moment when I could finally start typing quickly started to annoy me.
I remembered that some years ago I used to log in remotely to a running machine, fire up some script and have either my running XEmacs session connected or a new XEmacs process started. Connecting to the running XEmacs happens via gnuclient, that much was clear, but gnuclient doesn't start a new XEmacs if there is none running (actually, the server process gnuserv will die when the XEmacs process terminates, so there isn't much gnuclient can do). The emacs wiki page linked to above already contains a number of scripts that might do what I need, but I've found none of them convincing, so here is my version. It's Linux-specific in its BSD-style call to 'ps' to determine the running processes, but should be portable sh otherwise. I could have sprinkled the fetch_procs function with OS-specific variants, but as I'm currently running my XEmacs on Linux only, I left that as an exercise for the reader.
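The decision the script has to make can be sketched independently of sh. The following Python version is only an illustration of that logic, not the script from this post: the function names are invented for the example, and it assumes that the server shows up as a process whose command contains "gnuserv" and that `ps x` prints the usual PID, TTY, STAT, TIME and COMMAND columns.

```python
import subprocess


def gnuserv_is_running(ps_output):
    """Return True if a gnuserv process shows up in `ps x` style output."""
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 4)  # PID TTY STAT TIME COMMAND
        if len(fields) == 5 and "gnuserv" in fields[4]:
            return True
    return False


def editor_command(ps_output, filename):
    """Pick gnuclient if a server is running, else start a fresh xemacs."""
    if gnuserv_is_running(ps_output):
        return ["gnuclient", filename]
    return ["xemacs", filename]


def edit(filename):
    # BSD-style `ps x`: the current user's processes, with or without a tty.
    ps_output = subprocess.run(
        ["ps", "x"], capture_output=True, text=True, check=True
    ).stdout
    subprocess.run(editor_command(ps_output, filename))
```

Keeping the process lookup separate from the decision makes it possible to try out `editor_command` with canned `ps` output, without an XEmacs running at all.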
ObTitle: Panic at the disco, from the album "A fever you can't sweat out"
New records for the masses
Posted by Holger Schauer in Music
I haven't written a lot under this title in the last year, mainly because I never felt like it. This doesn't imply I stopped writing for plattentests.de (German), but I didn't write as much as I used to. That won't change; however, I'll see if I can write a little about the music I get my hands on. I won't cover all of 2008 here, but 2009 is still relatively fresh, so I'll try to talk about the records I've heard this year so far.
My latest review for plattentests.de is the new one from Bob Mould, "Life and times". That's the guy who kicked punk from hardcore to post-punk with his band Husker Du and also the one whose alternative band Sugar was quite successful in the first half of the 90s. The new record is a mix of quieter songs and the usual alternative rockers, Sugar-style, so if you know Mr. Mould, I can assure you you'll get what you pay for. (7/10)
The previous review was about Demons, "Ace in the hole". If you like punk rock firing straight out of the garage, along the lines of New Bomb Turks or Turbonegro, these Swedes are going to kick your ass. (7/10)
This year's favourite so far is the new album "The century of self" by ... And you will know them by the trail of dead. I'm planning to write a longer review of the record here (i.e. for my blog), but I can already say that I like it much more than their previous record "So divided". The new record contains, at least in part, noisier songs and is a little less laden with pathos, so overall well done. Although, I must admit, the surprises are not so numerous this time. (8/10)
Another nice record is the new one by The Decemberists, "The hazards of love". It's somewhere between a folk and a traditional hard-rock album, but what I find interesting is that it's a completely interleaved story about "Margaret". Great storytelling, nice music, what else do I need? (7/10)
Then there is the new Franz Ferdinand, "Tonight: Franz Ferdinand". Although I've become rather sick and tired of this kind of disco indie pop, Franz Ferdinand are something of a special exception to me. And, hey, with "No you girls" in the advert every evening, everybody is going to talk about them anyway, so I'm kind of forced to buy it, no? However, I probably shouldn't have given in to the urge. (5/10)
Okay, so far so few.
There are still a lot of records I have to check out: Whitest Boy alive, Morrissey, Mando Diao and The View all have new albums, and the White Lies also got a nice review on plattentests.de. And there's more to come, guaranteed.
Wolfram alpha is going to continue to be alpha
Posted by Holger Schauer in Knowledge processing
It's always the same pattern: some well-known figure comes up with some idea and everybody jumps onto the bandwagon. I have the strong suspicion that the media continue to fall for one of the famous logical fallacies, the appeal to authority. Or perhaps it's just clever marketing. This time the hype is around a new project by the physics guy Stephen Wolfram of Mathematica fame: Wolfram alpha is coming and could be as important as Google. To cut a long boring blog entry short, I wouldn't hold my breath.
Question answering is quite an old subdiscipline of computational linguistics, which nevertheless has seen a lot of progress in recent years. Still, it happens to be a pretty hard task even in closed domains or for a given training set (see the results of the various TREC conferences, where TREC is an acronym for text retrieval conference). Question answering in the open domain, as Wolfram alpha seems to address, is not one but multiple orders of magnitude harder: all of a sudden you no longer have a controlled terminology for queries, and the amount of information you have to index and search is unbelievable.
There have been attempts in the past to deal with this problem. One particularly well-known approach was, or is, Cyc, which tried to build a huge knowledge base of everyday knowledge. There have been several attempts to use Wordnet and, more recently, Wikipedia as a source of answers to questions. Even Microsoft tried to build a knowledge base from the data of its Encarta product. So why don't we already have a well-functioning open-domain question answering system if people have been trying to build one for something like fifty years? See above: because it's really hard. Think about the parts involved: information extraction in itself is not easy. Query parsing, as easy as it sounds, isn't trivial, either. Matching a query to an extracted piece of information usually requires a sophisticated system of knowledge representation or a similarly sophisticated statistical system. And text generation isn't a piece of cake either.
So what is it that Wolfram alpha does differently? From the scant and very fuzzy information available it's really hard to judge, but I have very serious doubts that they've found the holy grail of QA. Unfortunately, we haven't seen any tests of their system, but I guess there's a reason for that.
As can be seen, for instance, in the often ridiculous search results that the previous Google-killer hype, cuil, offers, overcoming real-world trouble like filtering out irrelevant or false data can be a major obstacle. Wolfram alpha of course doesn't have to filter out irrelevant web pages, but it has a problem that is probably even greater: filtering out false claims in its input data (assuming that they're operating on publicly available data), because otherwise they end up with wrong answers, which would be disastrous. But on which grounds could you automatically filter out "wrong facts"? You would already have to know the "correct" ones. So this leaves us with a lot of handcrafting, say to build a knowledge base of facts about which they can answer questions. However, we've seen in the past that any handcrafted knowledge base requires vast resources, and constantly so -- which is the major reason why wikipedia is a real problem for traditional encyclopedia publishers. Now remember that with Cyc there has already been an attempt to build up such a knowledge base; they've been working on it for roughly twenty-five years now and it's still not a system that is useful in reality (see, for instance, the list of criticisms of the Cyc project on the wikipedia page on Cyc).
But let's go back to that sensational article linked to above: in all seriousness, even if they can build a system that can answer a lot of questions, it's quite unlikely that they're going to become as important as Google is. Besides the fact that Google has enormous resources, including a lot of guys who know a lot about computational linguistics, and is hence likely to come up with a similar system if necessary, Google has not been just a simple search engine for quite some time now. Google nowadays is in no way comparable to what it was ten years ago: the major service they provide is information access in a large variety of ways, including multiple media sources and the integration of social interaction services.
Fact retrieval and question answering (mainly based on texts) is certainly important but information access encompasses a lot more.
We were dead before the ship even sank
Posted by Holger Schauer in Computer
This list of top tips for project managers is what Dilbert is all about. For many probably too close to home, too near the bone ...
(PS. I've managed to sneak two musical references into this one. Let's see if somebody figures them out.)
A short interruption for an event announcement
Posted by Holger Schauer in Freiburg, German, Music
I myself won't be in Freiburg tomorrow, Feb 20, as I'll be skiing elsewhere, but if I were here, I would go and listen to tomorrow's concert by The Horror The Horror at the Waldsee. THTH play indie guitar pop in the classic style, somewhere between the Strokes and the Smiths (for reviews on Plattentests.de see here and here). It almost annoys me a little that I'm going to miss the concert, because their last concert at the Jos Fritz was overcrowded and marred by lousy sound. Conditions at the Waldsee should be better.
Close range
Posted by Holger Schauer in Linux
I can't believe it: Debian really misses out on its own records. Lenny was released this weekend although the usual two years since the previous release are not over yet. Congratulations. Now I just have to go and check how old the packages of interest to me are out of the box this time ...
Testing and terminology confusion
Posted by Holger Schauer in Programming
I may be rather late to the game, but over the last one and a half years, I've become quite addicted to writing tests during my development tasks. I had wanted to dig into test-driven development for quite some time, but it was the seamless integration of Test::Unit, Ruby's unit testing module, in Eclipse that got me going initially. I then did some unit testing with Common Lisp packages and am currently making heavy use of pyunit and Python doctests (mostly in the context of Zope testing). Writing tests has become second nature to me during development: it gives you that warm fuzzy feeling of having a little safety net while modifying code.
However, there are times when terminology comes along and gives you a headache. One piece of terminology I've learned about during the last year is the difference between unit tests, integration tests and functional tests (for an overview see wikipedia on software testing). But as you can see, for instance, in this article on integration tests in Rails, it's not always easy to agree on what means what -- Jamis and/or the Rails community seem to have the integration/functional distinction entirely backwards from what, for instance, the Zope community (on testing) thinks. Now, one might argue that terminology doesn't matter much as long as you write tests at all, but it's not that easy. For instance, if your "unit test" of a given class requires another class, is that still unit testing or is it integration testing? Does it even make sense to talk about unit-testing a class? A class on its own isn't that interesting, after all; it's in its integration and interoperation with collaborators where the semantics of a class and its methods become interesting. Hence, shouldn't you rather test a specific behaviour, which probably involves those other classes? And what if your code only makes sense when run on top of a specific framework (Zope, Rails, you name it)? Michael Feathers argues convincingly in his set of unit testing rules that any such tests are probably something else.
Ultimately these questions directly pertain to two aspects: code granularity and code dependencies -- and remember, test code is code after all. These are directly related, of course: if your code is very fine-grained, it's much more likely that it will also be much more entangled (although the dependency might be abstracted with the help of interfaces or some such, you still have the dependency as such). And as a consequence, your test code will have to mimic these dependencies. Conversely, if your code blocks are more coarse-grained (i.e. cover a greater aspect of functionality), you might have fewer (inter-)dependencies, but you won't be able to test functionality on a more fine-grained level.
As Martin Fowler's excellent article Mocks aren't stubs discusses in detail, one way to loosen these connections between code and tests is to use mock objects or stubs. Fowler's article also made clear to me that I used the term "mock object" wrongly in my post on mock objects in Common Lisp: dynamically injecting an object/function/method (as a replacement for a collaborator required by the "code under test") that returns an expected value means using a stub, not a mock -- another sign of terminology that is not defined clearly enough (btw, the terminology Fowler is using is that of Gerard Meszaros's xUnit patterns book). It's worth keeping these things apart because of their different impact on test behaviour: mocks will force you to think about behaviour, whereas stubs focus on the 'results' of code calls (or on object state, if you think in terms of objects being substituted). As a result, when you change the behaviour of the code under test (say you're changing code paths in order to optimize code blocks), this might (mocks) or might not (stubs) result in changes to the test code.
It's also worth thinking about mocks and stubs because they shed a new light on the question of test granularity: when you're substituting real objects in either way, you're on your way to much more fine-grained tests, which implies that you loosen the dependencies of your tests: you can now modify the code of your collaborator class without breaking the test for your code under test. Which brings us full circle back to the distinction between unit tests and integration tests: you now might have perfect unit tests, but you're forced to additionally test the integration of all the bits and pieces. Otherwise you might have all unit tests succeed while your integrated code still fails.
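To make the difference between stubs and mocks concrete, here is a small Python sketch using the standard `unittest.mock` module. All names in it (`PriceService`, `order_total`, `order_total_cached` and the two check functions) are invented for this illustration. Strictly speaking, inspecting the recorded calls after the fact is closer to what Meszaros calls a test spy than to a mock with expectations set up front, but it shows the same trade-off: the stub-based check only looks at the result, the mock-based check also pins down how the collaborator was used.

```python
from unittest import mock


class PriceService:
    """Hypothetical collaborator; imagine it talks to a remote system."""

    def price_of(self, item):
        raise RuntimeError("not available in tests")


def order_total(items, prices):
    """Code under test: asks the collaborator for every single item."""
    return sum(prices.price_of(item) for item in items)


def order_total_cached(items, prices):
    """Optimized variant: looks up each distinct item only once."""
    cache = {}
    total = 0
    for item in items:
        if item not in cache:
            cache[item] = prices.price_of(item)
        total += cache[item]
    return total


class StubPrices:
    """Stub: returns a canned value and records nothing."""

    def price_of(self, item):
        return 2


def check_with_stub(total_fn):
    # State verification: only the result counts.
    return total_fn(["a", "b", "a"], StubPrices()) == 6


def check_with_mock(total_fn):
    prices = mock.Mock(spec=PriceService)
    prices.price_of.return_value = 2
    result_ok = total_fn(["a", "b", "a"], prices) == 6
    # Behaviour verification: the exact sequence of calls is checked, too.
    calls_ok = prices.price_of.call_args_list == [
        mock.call("a"), mock.call("b"), mock.call("a")]
    return result_ok and calls_ok


if __name__ == "__main__":
    for fn in (order_total, order_total_cached):
        print(fn.__name__, check_with_stub(fn), check_with_mock(fn))
```

Both checks accept `order_total`. After the optimization, the stub-based check still holds, whereas the mock-based check no longer does, because the collaborator is now called only twice.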
Given this relationship, it seems immediately clear that 100% test coverage might not be the most important issue with unit tests: you might have 100% unit test success, but 100% integration failure at the same time -- if you don't do continuous integration and integration tests, of course. Now what's interesting is that it might be possible to check test coverage on code paths, but it might not be easy to check integration coverage. I would be interested to learn about tools detailing such information.
Recently I had another aha moment with regard to testing terminology: Kevlin Henney's presentation at this year's German conference on object-oriented programming, the OOP 2009, on know your units: TDD, DDT, POUTing and GUTs. TDD is test-driven development, of course. The other ones might not be so obvious: "GUTs" are just good unit tests and "POUT" is "plain old unit testing". I saw myself as doing TDD, but come to think of it, I'm mostly applying a combination of TDD, POUT (after-the-fact testing) and DDT: defect-driven testing. I find the introduction of a term for testing after the code has been written interesting because it provides a way to talk about how to introduce testing in the first place. Especially defect-driven testing, the idea of writing a test to pinpoint and overcome an erroneous code path, might be a very powerful way to introduce the habit of regularly writing (some) tests for an existing large code base. That way you avoid the pitfall of never being able to test "all this code because there is never the time for it", and you might also motivate people to try writing tests before code. And on this level, it might at first not be that relevant to make the distinction between integration and unit tests too clear: start out with whatever is useful.
A cloud of words: yes, they can
Posted by Holger Schauer in Knowledge processing, Politics
I'm not one of the people to get over-excited by the new US presidency, hence I normally wouldn't have any reason to say anything about it here. As a computational linguist, however, I can't let ReadWriteWeb's word cloud comparison of Obama's inaugural speech to former speeches go unnoticed. From a casual look, it seems as if Bush communicated a lot more clearly what his presidency would be about, at least looking back on the last years. Of course, all that talk about liberty and freedom was probably just advance justification for the aggressive actions to come. The word cloud analysis of Obama looks much broader but also much more unspecific to me -- which matches the image I got from the media pieces on his previous election speeches, too. It will be very interesting to see if Obama can fulfill all the wishful thinking with which people approach his presidency (I wouldn't hold my breath, though) -- and, at some future time, how one might look back on that word cloud and which interpretation one is going to associate with all these terms.
Wlan update
Posted by Holger Schauer in Linux
Some time ago I started having nothing but trouble with my rt2570 based USB wlan stick. After we moved into our new flat, I finally set out to look for a better solution. Looking through some retail market I found a package claiming it would support Linux. After I was assured I could return the thing in case of trouble (like outdated proprietary drivers), I bought it -- a D-Link PCI card. After installing the thing, I quickly discovered that it's again a Ralink based product -- this time an RT61 based one, requiring firmware. I'm not happy with that, but at least my workstation has wifi again.
One thing has changed, btw.: the newer kernels (I run 2.6.27 at the time of this writing) happily support the driver out of the box (i.e., after a recompilation and figuring out where to put the firmware); no more manual compiling and patching of drivers is required. And then you can also use wpa_supplicant to configure the network, which makes the entry in /etc/network/interfaces a lot simpler:
Unfortunately, as you can probably guess, my connection is rather lousy and setting the rate up doesn't help much. It's wlan1 for me, btw., because I told /etc/udev/rules.d/z25_persistent-net.rules to configure it that way.

