wchar_t handling in IDLs

Andrew Bartlett abartlet at samba.org
Fri Oct 11 14:43:21 MDT 2013


On Fri, 2013-10-11 at 08:51 +0300, Alexander Bokovoy wrote:
> Hi!
> 
> I'm looking into some cases where printers are not accessible when
> served by Samba. Apparently a common issue is that they are named with
> characters outside Latin-1.
> 
> According to spoolss.idl, the device name in the spoolss_DeviceMode
> struct is declared as
> 
> [charset(UTF16),to_null] uint16 devicename[MAXDEVICENAME];
> 
> while it is 'wchar_t devicename[32]' on the Windows side.
> 
> Here is what we get for a printer with Russian letters in the name:
> ---------------------------------------------------------------------------------
> [2011/08/26 13:13:07.800902,  9] printing/nt_printing.c:4015(get_a_printer_2)
>   Unpacked printer [office-hp-p1505-prn] name [\\officesrv\HP P1505 у закупщиков] running driver [HP LaserJet P1505]
>   loading DEVICEMODE
> [2011/08/26 13:13:07.800980,  3] lib/charcnv.c:279(convert_string_internal)
>   convert_string_internal: Conversion error: Illegal multibyte sequence(<D1>)
> [2011/08/26 13:13:07.800998,  1] ../librpc/ndr/ndr.c:421(ndr_push_error)
>   ndr_push_error(5): Bad character conversion
> [2011/08/26 13:13:07.801014,  1] ../librpc/ndr/ndr.c:251(ndr_print_function_debug)
>        spoolss_GetPrinter: struct spoolss_GetPrinter
>           out: struct spoolss_GetPrinter
>               info                     : *
>                   info                     : union spoolss_PrinterInfo(case 2)
>                   info2: struct spoolss_PrinterInfo2
>                       servername               : *
>                           servername               : '\\192.168.8.2'
>                       printername              : *
>                           printername              : '\\192.168.8.2\HP P1505 у закупщиков'
>                       sharename                : *
>                           sharename                : 'office-hp-p1505-prn'
>                       portname                 : *
>                           portname                 : 'Samba Printer Port'
>                       drivername               : *
>                           drivername               : 'HP LaserJet P1505'
>                       comment                  : *
>                           comment                  : 'Printer HP P1505 on SERVER in Office'
>                       location                 : *
>                           location                 : 'Office, Server'
>                       devmode                  : *
>                           devmode: struct spoolss_DeviceMode
>                               devicename               : '\\officesrv\HP P1505 у зак<D1>'
>                               specversion              : DMSPEC_NT4_AND_ABOVE (1025)
>                               driverversion            : 0x0701 (1793)
>                               size                     : 0x00dc (220)
>                               __driverextra_length     : 0x04d8 (1240)
>                               fields                   : 0x0780ef03 (125890307)
> .......
> .......
> ------------------------------------------------------------------------------------
> 
> spoolss_DeviceMode.devicename is 32 wchar_t long. The full devicename
> would be 33 wchar_t chars. As we can see, it got cut down, perhaps on
> the Windows side. How should we deal with such broken multi-byte
> strings? Should the devicename be cut down to 32 wchar_t using the same
> algorithm everywhere, so that regardless of how it arrived we can
> recognize it and handle it without a conversion error?

What we need to do is create a new charset, like UTF16_MUNGED but
without the \0 -> \1 conversion (just create two wrapper functions, or
push that mapping up to the string2key caller), and then declare
devicename as that type.  A sequence that fails to convert would then
be replaced with the unmappable character, rather than failing the
whole conversion.  I also need this if we are going to show
'userParameters' over LDAP as 'utf8' (it isn't).
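
To make the difference concrete, here is a minimal standalone sketch of
a strict versus a mapping conversion from UTF-8 to UTF-16.  It is not
Samba code: the names utf8_to_utf16(), decode_utf8() and the 'lossy'
flag are hypothetical, and using U+FFFD as the substitute is an
assumption.  The flag stands in for the choice the IDL charset would
make per field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu /* assumed substitute for bad input */

/*
 * Decode one UTF-8 sequence at s (n >= 1 bytes available).
 * Returns the number of bytes consumed, or 0 if the sequence is
 * invalid or cut off.
 */
static size_t decode_utf8(const uint8_t *s, size_t n, uint32_t *cp)
{
	size_t len, i;
	uint32_t c, min;

	if (s[0] < 0x80) { *cp = s[0]; return 1; }
	else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
	else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
	else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
	else return 0; /* stray continuation byte or invalid lead byte */

	if (len > n) return 0; /* sequence cut off, as in the log above */
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xC0) != 0x80) return 0;
		c = (c << 6) | (s[i] & 0x3F);
	}
	/* reject overlong forms, surrogates and out-of-range values */
	if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;
	*cp = c;
	return len;
}

/*
 * Convert srclen bytes of UTF-8 into at most dstcap UTF-16 code units.
 * lossy == false: return -1 on the first invalid byte.
 * lossy == true:  emit REPLACEMENT_CHAR for each invalid byte instead.
 * Returns the number of code units written, or -1 on error
 * (invalid input in strict mode, or dst too small).
 */
long utf8_to_utf16(const uint8_t *src, size_t srclen,
		   uint16_t *dst, size_t dstcap, bool lossy)
{
	size_t in = 0, out = 0;

	assert(src != NULL && dst != NULL);
	while (in < srclen) {
		uint32_t cp;
		size_t used = decode_utf8(src + in, srclen - in, &cp);

		if (used == 0) {
			if (!lossy) return -1;
			cp = REPLACEMENT_CHAR;
			used = 1; /* skip one bad byte and resynchronise */
		}
		if (cp >= 0x10000) { /* needs a surrogate pair */
			if (out + 2 > dstcap) return -1;
			cp -= 0x10000;
			dst[out++] = (uint16_t)(0xD800 | (cp >> 10));
			dst[out++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
		} else {
			if (out + 1 > dstcap) return -1;
			dst[out++] = (uint16_t)cp;
		}
		in += used;
	}
	return (long)out;
}
```

Fed the tail of the devicename from the log (the UTF-8 bytes for the
three Cyrillic letters followed by a lone 0xD1), the strict mode fails
the whole string, while the lossy mode keeps the three letters and maps
only the dangling byte.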

Microsoft doesn't seem to fail conversions; they just map.  I don't
think that's always the best programming practice (we used to do that),
but at least if we set up the new charset, we can control in the IDL
where we choose to do it.
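
On the truncation question itself: the devicename in the log ends in a
lone 0xD1, which is the first byte of a two-byte UTF-8 sequence, so the
cut appears to have fallen in the middle of a character.  Independently
of the charset, a string we truncate ourselves could be cut on a
character boundary.  A minimal sketch, assuming the name is held as a
NUL-terminated UTF-8 string; utf8_prefix_for_utf16() and utf8_seq_len()
are hypothetical names, not existing Samba functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sequence length implied by a UTF-8 lead byte; 0 if not a lead byte. */
static size_t utf8_seq_len(uint8_t lead)
{
	if (lead < 0x80) return 1;
	if ((lead & 0xE0) == 0xC0) return 2;
	if ((lead & 0xF0) == 0xE0) return 3;
	if ((lead & 0xF8) == 0xF0) return 4;
	return 0;
}

/*
 * Return the byte length of the longest prefix of s that ends on a
 * character boundary and needs at most max_units UTF-16 code units.
 * Stops early at an invalid lead byte or a cut-off tail; continuation
 * bytes are not validated in this sketch.
 */
size_t utf8_prefix_for_utf16(const char *s, size_t max_units)
{
	const uint8_t *p = (const uint8_t *)s;
	size_t n, in = 0, units = 0;

	assert(s != NULL);
	n = strlen(s);
	while (in < n) {
		size_t len = utf8_seq_len(p[in]);
		/* 4-byte sequences become a surrogate pair in UTF-16 */
		size_t need = (len == 4) ? 2 : 1;

		if (len == 0 || in + len > n) break; /* malformed or cut off */
		if (units + need > max_units) break; /* would not fit */
		in += len;
		units += need;
	}
	return in;
}
```

A caller would copy only the returned number of bytes before
converting.  Whether one of the 32 slots has to be reserved for a
terminator is left to the caller here.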

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org




More information about the samba-technical mailing list