Extended Zend_Ldap is in Standard Incubator

After being accepted for Standard Incubator development by the Zend Framework team on November 1st, 2008, the extended Zend_Ldap component (formerly known as Zend_Ldap_Ext) has been moved into the Zend Framework Standard Incubator and can be checked out from there.

This is what the component is all about (from the proposal page):

The existing Zend_Ldap component currently just responds to authentication use cases in all their varieties. There is no possibility to query an LDAP directory service in a unified and consistent way. The current component also lacks core CRUD (Create, Retrieve, Update and Delete) functionality, operations that are crucial to, for example, database abstraction layers.
This proposal tries to resolve these deficiencies by providing a simple two-ply object-oriented model to connect to, query and perform CRUD operations on an LDAP server. The first layer is a wrapper around the ext/ldap functions, spiced up with extended functionality such as copying and moving (renaming in an LDAP context) nodes and subtrees.
The second layer (Zend_Ldap_Node) provides an active-record-like interface to LDAP entries and stresses the tree structure of LDAP data by providing (recursive) tree traversal methods.
To simplify the usage of the unfamiliar LDAP filter syntax, this component proposes an object-oriented approach to LDAP filter string generation, which can loosely be compared to Zend_Db_Select.
Useful helper classes for creating and modifying LDAP DNs and converting attribute values complete this component.
Furthermore it is possible to do some LDAP schema browsing and to read and write LDIF files.
It is important to note that this proposal is a complete replacement for the current Zend_Ldap component and does not break backwards-compatibility.

Later today I’ll try to publish a short tutorial on the usage of the (hopefully) new Zend_Ldap component.

December 7, 2008 at 16:50 1 comment

On how to sort an array of UTF-8 strings

This article is based on a question I asked on stackoverflow.com and illustrates how I answered it myself and discovered a PHP bug on Windows along the way.

Sorting an array of strings in PHP seems to be a no-brainer. There are a lot of sort functions, with sort() being the most common one. The problem arises when the strings in the array are multi-byte encoded, for example UTF-8 encoded. Because PHP comparison functions cannot operate on those strings (they do a byte-by-byte comparison), sorting does not work as expected either. Furthermore, language-specific sorting properties are not taken into consideration when sorting with sort() and the default parameters. In Swedish, for example, an Ä is sorted at the end of the alphabet, while in German Ä is normally equivalent to A (when using the DIN 5007 sorting method).
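A minimal sketch of the byte-wise behaviour described above, assuming the script file is saved as UTF-8 (the word list is made up for illustration):

```php
<?php
// Illustrative word list; assumes this file is saved as UTF-8.
$words = ['Äpfel', 'Zebra', 'Apfel'];

// In UTF-8, 'Ä' is the two bytes 0xC3 0x84, both larger than any ASCII letter.
echo bin2hex('Ä'), PHP_EOL; // c384

// sort() with default flags compares these strings byte by byte,
// so 'Äpfel' ends up after 'Zebra'.
sort($words);
echo implode(', ', $words), PHP_EOL; // Apfel, Zebra, Äpfel
```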

Fortunately PHP provides a function which copes with this problem: strcoll(). The function can be used for array sorting by just specifying the function name as the second parameter to usort(). The sort() function also has a flag (SORT_LOCALE_STRING) which actually seems to do the same as usort() together with a strcoll() callback.

To summarize, sorting an array of UTF-8 strings in a language-aware manner is more or less a question of setting the correct locale. Let’s look at the following example, which uses German as the reference language and is saved with UTF-8 encoding.

$array=array('Übergabe', 'Ostfriesland', 'Äpfel', 'Unterführung', 'Apfel', 'Österreich');
$oldLocale=setlocale(LC_COLLATE, "0");
setlocale(LC_COLLATE, 'de_DE.utf8');
usort($array, 'strcoll'); // or equivalent sort($array, SORT_LOCALE_STRING);
setlocale(LC_COLLATE, $oldLocale);

This will result in an array of Apfel, Äpfel, Österreich, Ostfriesland, Übergabe, Unterführung (obviously we’re using DIN 5007 sorting here).

As sorting is now locale-dependent, we have to respect the PHP environment: what machine are we running our script on, Windows or *nix?

First of all, if we have a *nix machine, the used locale must be installed on the system. You can get a list of installed locales by issuing the command locale -a on the command line. Be sure to use the correct encoding with the desired locale – the encoding must match the string encoding.
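Since setlocale() returns false when a requested locale is not installed, the check can also be done from PHP. The following is only a sketch: withCollationLocale() is a hypothetical helper name, and the locale names in the usage part are examples that must exist on the target system.

```php
<?php
/**
 * Hypothetical helper: run $callback with the first available locale from
 * $candidates set for LC_COLLATE, then restore the previous locale.
 */
function withCollationLocale(array $candidates, callable $callback)
{
    $oldLocale = setlocale(LC_COLLATE, '0'); // "0" only queries the current setting
    // setlocale() tries each candidate in turn and returns false if none can be set.
    if (setlocale(LC_COLLATE, $candidates) === false) {
        throw new RuntimeException(
            'None of the requested locales is installed: ' . implode(', ', $candidates)
        );
    }
    try {
        return $callback();
    } finally {
        // Always restore the previous collation locale, even on errors.
        setlocale(LC_COLLATE, $oldLocale);
    }
}

// Usage: the locale names are examples only.
try {
    $sorted = withCollationLocale(array('de_DE.utf8', 'de_DE.UTF-8'), function () {
        $array = array('Übergabe', 'Apfel', 'Äpfel');
        usort($array, 'strcoll');
        return $array;
    });
    echo implode(', ', $sorted), PHP_EOL;
} catch (RuntimeException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

Passing several candidate names lets the same call cope with systems that spell the encoding suffix differently.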

Things get more complicated on Windows machines as locales are named differently. The default naming scheme is Language_Country.Encoding. Information on locales on Windows can be found on MSDN: Language and Country/Region Strings, Language Strings, Country/Region Strings and Code Pages. Furthermore, encodings are not specified like on *nix machines but rather by using code pages. As we’re using UTF-8 in our example we have to use the UTF-8 Windows code page, which is 65001. Putting all these things together we arrive at a locale of German_Germany.65001 for our example. For the sake of completeness, the normal code page for Western Europe would be 1252.

This leads us to the following code snippet (UTF-8 encoded strings):

$array=array('Übergabe', 'Ostfriesland', 'Äpfel', 'Unterführung', 'Apfel', 'Österreich');
$oldLocale=setlocale(LC_COLLATE, "0");
setlocale(LC_COLLATE, 'German_Germany.65001');
usort($array, 'strcoll'); // or equivalent sort($array, SORT_LOCALE_STRING);
setlocale(LC_COLLATE, $oldLocale);

What the heck???? Übergabe, Apfel, Ostfriesland, Unterführung, Äpfel, Österreich?? That obviously doesn’t work… What’s the problem? Let’s try to use non UTF-8 strings (don’t forget to recode the file to ANSI, Windows-1252 or ISO-8859-1):

$array=array('Übergabe', 'Ostfriesland', 'Äpfel', 'Unterführung', 'Apfel', 'Österreich');
$oldLocale=setlocale(LC_COLLATE, "0");
setlocale(LC_COLLATE, 'German_Germany.1252');
usort($array, 'strcoll'); // or equivalent sort($array, SORT_LOCALE_STRING);
setlocale(LC_COLLATE, $oldLocale);

Now we get Apfel, Äpfel, Österreich, Ostfriesland, Übergabe, Unterführung. OK, non-UTF-8 is working correctly. Let’s dig in deeper. What does strcoll() do with my array? Let’s trace what’s going on (thanks to Huppie for the idea of tracing what strcoll() is doing):

function traceStrColl($a, $b) {
    $outValue=strcoll($a, $b);
    echo "$a $b $outValue\r\n";
    return $outValue;
}

$array=array('Übergabe', 'Ostfriesland', 'Äpfel', 'Unterführung', 'Apfel', 'Österreich');
$oldLocale=setlocale(LC_COLLATE, "0");
setlocale(LC_COLLATE, 'German_Germany.65001');
usort($array, 'traceStrColl');
setlocale(LC_COLLATE, $oldLocale);

The output is:

Äpfel Ostfriesland 2147483647
Äpfel Übergabe 2147483647
Äpfel Unterführung 2147483647
Äpfel Apfel 2147483647
Österreich Äpfel 2147483647
Ostfriesland Apfel 2147483647
Ostfriesland Übergabe 2147483647
Unterführung Ostfriesland 2147483647
Apfel Übergabe 2147483647

As you can see, strcoll() returns 2147483647 (the largest 32-bit signed integer) on every comparison. This is reproducible and occurs only on Windows machines (by the way, the PHP version does not seem to matter, as I tried the snippet on PHP 5.2.4, 5.2.5 and 5.2.6). This is what I’d classify as a bug, so I filed a bug report on bugs.php.net: Bug #46165 strcoll() does not work with UTF-8 strings on Windows

Summary: Currently it is not possible to sort UTF-8 strings on a Windows machine simply using PHP-provided functions. A possible solution would be to recode the strings to Windows-1252 or ISO-8859-1 encoding (using mb_convert_encoding() or iconv()) and do a sort on the recoded array (provided by ΤΖΩΤΖΙΟΥ on stackoverflow.com).
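The recoding workaround could look roughly like the sketch below. It is not a tested fix for the bug: sortUtf8ViaSingleByte() is a hypothetical helper name, the $compare parameter is an addition for illustration, and the approach only works for strings whose characters all exist in the target encoding.

```php
<?php
/**
 * Hypothetical helper: sort UTF-8 strings by recoding them to a single-byte
 * encoding, sorting the recoded copies, and converting the result back.
 * $compare defaults to strcoll(), so the caller sets LC_COLLATE beforehand.
 */
function sortUtf8ViaSingleByte(array $strings, $encoding = 'Windows-1252', $compare = 'strcoll')
{
    $recoded = array();
    foreach ($strings as $string) {
        $converted = iconv('UTF-8', $encoding, $string);
        if ($converted === false) {
            throw new InvalidArgumentException("Cannot represent '$string' in $encoding");
        }
        $recoded[] = $converted;
    }

    // Sort the single-byte copies with the given comparison function.
    usort($recoded, $compare);

    // Convert the sorted strings back to UTF-8.
    $result = array();
    foreach ($recoded as $string) {
        $result[] = iconv($encoding, 'UTF-8', $string);
    }
    return $result;
}

// Usage on Windows, with the code page 1252 locale used earlier in this post:
$oldLocale = setlocale(LC_COLLATE, '0');
setlocale(LC_COLLATE, 'German_Germany.1252');
$sorted = sortUtf8ViaSingleByte(
    array('Übergabe', 'Ostfriesland', 'Äpfel', 'Unterführung', 'Apfel', 'Österreich')
);
setlocale(LC_COLLATE, $oldLocale);
```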

September 24, 2008 at 12:16 8 comments

Installed phpUnderControl on our development server

Just installed phpUnderControl and CruiseControl on our development server. Everything went quite smoothly; only the Java installation caused some problems, as it was mentioned nowhere that you need to install the Java SE Development Kit (JDK) and that the Java SE Runtime Environment (JRE) is not sufficient.

These are the required software packages:

The only disadvantage of the current installation is that we have to use SSH tunnels to get to the phpUnderControl dashboard. By a fluke I just stumbled on an article by Max Horvath, who describes how to set up Apache to access the phpUnderControl pages via a proxy and thereby avoid the SSH tunnels. I’ll try this later today.

August 15, 2008 at 14:43 3 comments

Bug in Zend_Cache – already fixed

Just wanted to blog about a Zend_Cache bug I discovered yesterday (ZF-3923) that made it impossible to use custom (namespaced) frontends or backends, when I noticed that the error has already been fixed in the trunk (rev. 10895).

Thanks a lot to the person who fixed the bug – whoever it was (the ZF issue tracker is down right now)! Good job!

August 12, 2008 at 14:37 Leave a comment

Zend_Ldap_Ext proposal updated

Yesterday I updated the proposal wiki page for my Zend_Ldap_Ext proposal to be compliant with the current source. Currently there does not seem to be much interest in an extended Zend_Ldap component, but in my opinion the framework should include such a component to allow for simple data exchange with LDAP servers, the same way it provides means to talk to databases and webservices.

The proposal originates from my need for a unified data access layer to an LDAP server in my current project. The proposal features:

  • querying the LDAP server
  • retrieving LDAP entries
  • creating LDAP entries
  • updating LDAP entries
  • deleting LDAP entries
  • LDAP filter string creation
  • tree traversal methods
  • other stuff like attribute encoding and decoding

Some parts of the proposal are inspired by the brilliant PEAR:Net_LDAP2 package.

I think LDAP usage should be as easy as querying a database, and the current ext/ldap with its three different resource types is predestined for being wrapped up in an object-oriented interface. Just take the following examples:

ext/ldap

$ds=ldap_connect("localhost");
ldap_bind($ds);
$sr=ldap_search($ds, "o=My Company, c=US", "sn=S*");
$info=ldap_get_entries($ds, $sr); // result is buffered on the machine
for ($i=0; $i<$info["count"]; $i++) {
    echo "dn is: " . $info[$i]["dn"] . "<br />";
    echo "first cn entry is: " . $info[$i]["cn"][0] . "<br />";
    echo "first email entry is: " . $info[$i]["mail"][0] . "<br /><hr />";
}
ldap_free_result($sr);
ldap_close($ds);

Zend_Ldap with ext/ldap

$ldap=new Zend_Ldap(array(/* options */));
$ldap->bind();
$sr=ldap_search($ldap->getResource(), "o=My Company, c=US", "sn=S*");
$info=ldap_get_entries($ldap->getResource(), $sr); // result is buffered on the machine
for ($i=0; $i<$info["count"]; $i++) {
    echo "dn is: " . $info[$i]["dn"] . "<br />";
    echo "first cn entry is: " . $info[$i]["cn"][0] . "<br />";
    echo "first email entry is: " . $info[$i]["mail"][0] . "<br /><hr />";
}
ldap_free_result($sr);
$ldap->disconnect();

Zend_Ldap_Ext

$ldap=new Zend_Ldap_Ext(array(/* options */));
$ldap->bind();
foreach ($ldap->search("sn=S*") as $item) { // items will be fetched from the LDAP when needed
    echo "dn is: " . $item["dn"] . "<br />";
    echo "first cn entry is: " . $item["cn"][0] . "<br />";
    echo "first email entry is: " . $item["mail"][0] . "<br /><hr />";
}
$ldap->disconnect();

This is just a simple example… But think of moving or copying complete subtrees, renaming entries, traversing the tree recursively and so on. Furthermore there are some common pitfalls with escaping values in filter strings and escaping DN values. My aim is to develop Zend_Ldap_Ext so that it addresses the most common workflows when dealing with LDAP servers.
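To illustrate the filter escaping pitfall, here is a minimal sketch of escaping a value before it is placed into a filter string, following the escaping rules of RFC 4515. The function escapeFilterValue() is a hypothetical helper written for this example; it is not the Zend_Ldap_Ext API. Newer PHP versions also ship ldap_escape() in ext/ldap for this purpose.

```php
<?php
/**
 * Hypothetical helper (not part of Zend_Ldap or ext/ldap): escape a value for
 * use inside an LDAP search filter, following the escaping rules of RFC 4515.
 */
function escapeFilterValue($value)
{
    // strtr() with an array replaces each character once and never re-scans
    // its own output, so the backslashes it inserts are not escaped again.
    return strtr($value, array(
        '\\' => '\\5c',
        '*'  => '\\2a',
        '('  => '\\28',
        ')'  => '\\29',
        "\0" => '\\00',
    ));
}

// Without escaping, the parentheses and the asterisk would change the filter's meaning.
$userInput = 'Smith (Jr.)*';
$filter = '(sn=' . escapeFilterValue($userInput) . ')';
echo $filter, PHP_EOL; // (sn=Smith \28Jr.\29\2a)
```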

August 11, 2008 at 11:18 2 comments

Hello world!

Welcome to WordPress.com. This is your first post. Edit or delete it and start blogging!

August 10, 2008 at 19:34 Leave a comment
