Sharp PA-W1400/PA-W1410 Data Recovery



Background

My father bought a Sharp PA-W1400 'word-processor' a long, long time ago and it served him well. Recently he made it into the world of laptops, with a basic machine running Linux. Alas, he had a lot of documents on his Sharp PA-W1400 without a printout, and had assumed that having two copies on different floppy disks would keep his data safe.

But when the day came that he needed something off a disk, the floppy disk drive would no longer detect the presence of a disk, and so nothing could be read. At that point I got a call asking for help, and of course the sorry state of affairs came out: not only were there problems reading the floppy disks (due to an old and not-quite-complete FAT), but converting the files into some readable text was not trivial.

So far Sharp have been unable to help. There are companies such as LuxSoft who offer help with some old word-processors, but alas not the very old PA-W1400 model at that time. As a result I ended up doing my own conversion and this page is up to help anyone else in that position.

It is possible this conversion software will work with other Sharp models such as the PA3100 series (PA3140 etc), but I don't know as I have no example documents for them. In fact, they may not even have a floppy drive to save files to!

Reading the Floppy Disk

The word-processor implements a basic version of the FAT-12 file system for 720kB floppy disks only, but it was never updated and its implementation appears to date from around 1984-88. My experience was that most 'modern' (as in post-1995) computers that still have a floppy disk drive see the disk as unformatted and don't do anything useful. This is not unusual, and Microsoft has a note about one potential cause here: http://support.microsoft.com/kb/140060/en-us

    Whatever you do, DO NOT FORMAT THE FLOPPY DISK! You will destroy your data that way!

There are three options that I could think of, but due to my own circumstances I went for the most complicated method of getting the files off the 720kB floppy disks. First I found an older Linux PC that still had a working 3.5" floppy disk drive, and then made an image of the disk by using the 'dd' program to make a sector-by-sector copy:

    dd if=/dev/fd0 of=disk.img bs=512 count=1440 conv=noerror,sync

Take particular care to get 'if' and 'of' the right way round, or you will also discover why 'dd' is nick-named 'destroy data'!
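Once the image is made, a quick sanity check is to confirm it has the full 1440 sectors of 512 bytes (737280 bytes). The sketch below is only an illustration: the helper name check_image_size is made up here, and a correct size says nothing about whether the sectors themselves were read without errors.

```shell
# Expected size of a 720kB floppy image: 1440 sectors of 512 bytes.
EXPECTED_BYTES=$((1440 * 512))   # 737280

# Hypothetical helper: succeeds only if the image file has the expected size.
check_image_size() {
    actual=$(wc -c < "$1") || return 1
    if [ "$actual" -eq "$EXPECTED_BYTES" ]; then
        echo "OK: $1 is $actual bytes"
    else
        echo "WARNING: $1 is $actual bytes, expected $EXPECTED_BYTES" >&2
        return 1
    fi
}
```

Run it as 'check_image_size disk.img'. A shorter file suggests the copy stopped early or some sectors were skipped, in which case it is worth trying again, possibly with a different drive.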

Then I used a 'virtual machine' that was running MS-DOS 6.22 and told the VMware player to connect a copy of the resulting disk.img file as the floppy disk. I then used chkdsk to check/fix the file system, and then xcopy to copy the files off the A: floppy disk and eventually on to my PC.

NOTE: The old chkdsk program seems to be fine with the Sharp PA-W1400 file system, but the newer Scandisk program just broke things. Also using the DOS copy command seemed to cause problems, sometimes severe, so I would strongly recommend you use the 'xcopy' program to transfer the files to the DOS computer's C: hard disk.
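Since chkdsk modifies the file system it repairs, it seems prudent to keep the original image untouched and only ever attach a copy to the virtual machine. A minimal sketch of that, assuming the image is called disk.img as above; the helper name make_working_copy and the file name work.img are invented for the example.

```shell
# Hypothetical helper: copy the master image, then write-protect the master.
# $1 = master image (e.g. disk.img), $2 = working copy to create (e.g. work.img)
make_working_copy() {
    master=$1
    work=$2
    cp -- "$master" "$work" &&
    chmod a-w -- "$master" &&      # master becomes read-only
    chmod u+w -- "$work" &&        # working copy stays writable for chkdsk
    cmp -- "$master" "$work"       # confirm the two are byte-for-byte identical
}
```

For example, 'make_working_copy disk.img work.img', and then point the virtual machine at work.img. If chkdsk or anything else makes a mess of it, a fresh copy can be made without going back to the floppy.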

The second, simpler option is to find an old DOS computer with a working floppy disk drive and use it to do the copying.

The third, even simpler, option is to pay a company such as LuxSoft to do the work for you!

Warning: Always check the floppy disk drive is healthy by testing it out on a blank or unimportant disk before you use it! I have seen faulty floppy drives scratch and thus destroy data on a disk before.

Understanding the File Format

The result of copying is typically a set of small files with a .doc extension. However, they are nothing like Word or similar file formats, and they appear to be very specific to the PA-W1400 series of word-processors.

I tried to reverse-engineer the file format and had some success, at least enough to keep my father happy.

But I know there are aspects that I do not understand, and some symbols and accented characters that my father did not use, and so I have no idea of how they are implemented.

The PA-W1400 has only two types of 'formatting': bold and underlined. I ignored bold formatting completely, and implemented the 'underlined format' only for the space, as that has an ASCII equivalent in the '_' symbol. It is likely that super-script & sub-script were options, but again I have no examples of that.

It supported fractions such as ¼, ½, and ¾ which I think I have implemented correctly. It also supported the addition of umlaut (such as ü), acute (á) and grave (è) accents, and probably more such as the circumflex and tilde, but those were the only ones I had examples of.

Also I think it supported the degree symbol, and the cedilla (as in Français), but I have no examples to know how to convert them.
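For anyone checking converted output by eye or with a hex dump, it may help to know what the fractions and accented letters mentioned above look like as UTF-8 bytes. The sketch below prints them using their standard UTF-8 encodings. These are not the PA-W1400's internal codes, and it assumes precomposed characters; a converter could equally emit a plain letter followed by a combining accent. The function name utf8_reference is made up for the example.

```shell
# Hypothetical helper: print reference characters as raw UTF-8 bytes.
# Each octal pair is one character, e.g. \302\274 is C2 BC, i.e. U+00BC (1/4).
utf8_reference() {
    #        1/4      1/2      3/4      u-uml    a-acute  e-grave
    printf '\302\274 \302\275 \302\276 \303\274 \303\241 \303\250\n'
}
```

Piping both 'utf8_reference' and a converted file through 'od -An -tx1' allows the byte values to be compared side by side.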

Converting the Files

First you need to download the conversion software source code (MD5 sum is cfba1f06236334aca6d5887a0bc58be2) and compile that. After unzipping the file, from the Linux command prompt you compile with:

    gcc -Wall convert.c -o conv

The small program 'conv' can then be used to convert your extracted files. Please note it will simply overwrite the output file, so I strongly suggest you make all of your extracted .doc files read-only first:

    chmod a-w *.doc

Then do the conversion with a command such as:

    ./conv example.doc example.txt
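With more than a handful of files, a small loop saves typing. The sketch below is one way to do it, assuming the files end in lower-case .doc and that 'conv' sits in the current directory; the names CONV and convert_all are made up for the example. Because conv overwrites its output file, the loop skips any file whose .txt already exists.

```shell
# Path to the compiled converter; override by setting CONV beforehand.
CONV=${CONV:-./conv}

# Hypothetical helper: convert every .doc file in the current directory.
convert_all() {
    for src in *.doc; do
        [ -e "$src" ] || continue      # no .doc files: pattern stays unexpanded
        dst="${src%.doc}.txt"          # example.doc -> example.txt
        if [ -e "$dst" ]; then
            echo "skipping $src: $dst already exists" >&2
            continue
        fi
        "$CONV" "$src" "$dst" || echo "conversion failed for $src" >&2
    done
}
```

Run 'convert_all' from the directory holding the extracted files. If the files came off the DOS disk with upper-case names, change the pattern to *.DOC to match. The failure message relies on conv returning a non-zero exit status on error, which is an assumption rather than something checked here.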

The resulting text file uses UTF-8 coding for non-ASCII characters and should be easy to read using a decent text editor, or to import into a modern word-processor such as the free OpenOffice / LibreOffice suite.
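If an editor shows odd characters, one quick test is whether the file really is valid UTF-8. A minimal sketch using iconv, which stops with an error on a byte sequence that is not valid in the source encoding; the helper name check_utf8 is invented for the example.

```shell
# Hypothetical helper: report whether a file is valid UTF-8.
check_utf8() {
    # Converting UTF-8 to UTF-8 changes nothing, but fails on invalid input.
    if iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1; then
        echo "$1: valid UTF-8"
    else
        echo "$1: contains bytes that are not valid UTF-8" >&2
        return 1
    fi
}
```

For example, 'check_utf8 example.txt'. A pass only means the bytes are well formed. It cannot tell whether a symbol was converted to the right character, so a read-through of the text is still needed.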

Repeated Disclaimer

I am sure you know the drill about 'free' software, and I just want to make it clear once again that it comes with No Warranty At All. As you can see above, and by looking at the source code before compiling it (you did not run an unknown bit of software without checking, did you?) I simply don't have enough information to do a decent job.

If you can help with more information and/or example files, please let me know.

Useful Links

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references




Contact

Please send any comments or feedback to psc_AT_sat_DOT_dundee_DOT_ac_DOT_uk, which I trust you can work out; otherwise I'm afraid I have to tell you that your grades are insufficient to pass the Turing Test...

(c) Paul Crawford, 25th July 2011