
Archive: Customizing

Learn Python in 10 Minutes or Less

Ryan Weisenberger
Manager, Software Development

Python is the primary language used throughout Ultraseek. Python has been available since the early 1990s and has matured into a robust and powerful language. An understanding of Python can help you perform advanced customizations of the user interface and easily augment Ultraseek’s behavior through patches.py. Anyone with experience in other programming languages can pick up the basics of Python in just a few minutes.

Step 1. Download the Python interpreter

  1. Go to the Python website.
  2. Click the “Download” link.
  3. Click on the latest version of the Python interpreter. As of June 2005, this is Python 2.4.1.
  4. Download the binary package of Python for your operating system.
  5. Install Python on your system according to the Python installation instructions.

Step 2. Run Python

After you have installed Python on your system, you should run it according to the Python installation instructions. As Python starts, it will display something like the following:

Python 2.4 (#60, Nov 30 2004, 11:49:19)
[MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license"
for more information.
>>>

That >>> is the prompt for the Python command-line interpreter. You can enter Python expressions directly into the interpreter, and immediately see their evaluation.

For example, enter 2+2, and the interpreter will return 4.

Step 3. Take the Python tutorial

Now that you have a fully functioning Python interpreter on your system, you should take the Python tutorial.

This tutorial will guide you through the basics of Python, such as the syntax, flow control, and the use of modules.
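As a small taste of what the tutorial covers, here is a minimal sketch that touches syntax, flow control, and modules. The function name describe and the sample values are made up for illustration; the print calls are written with parentheses, as current Python releases expect.

```python
import math  # modules are loaded with the import statement


def describe(n):
    """Return a short description of n using if/elif/else flow control."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"


# Blocks are delimited by indentation rather than braces.
squares = []
for i in range(5):
    squares.append(i * i)

print(2 + 2)          # the same expression entered at the prompt in Step 2
print(describe(-3))
print(squares)
print(math.sqrt(16))  # a function provided by the math module
```

You can type these lines directly at the >>> prompt or save them to a file and run the file with the interpreter.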

Step 4. Read a book or two

At this point, you should have a strong enough grasp of basic Python programming concepts to experiment with the Ultraseek user interface, or maybe even patches.py. If you want to delve deeper into Python, check out some of the Python books available. A comprehensive list can be found at the Python wiki.

Posted July 08, 2005 by editor

Adding a Custom Document Filter

Ryan Weisenberger
Manager, Software Development

While Ultraseek can parse a wide variety of document formats from several different vendors, occasionally an enterprise has a document format that is unknown to Ultraseek. In this case, it is possible to add your own document filter so Ultraseek can parse your custom document type.

To add a user-defined document type to Ultraseek, there are three steps you must follow.

  1. Add instructions to your patches.py file that tell Ultraseek the name of your document type and the name of a program that can convert it to HTML.
  2. In the Ultraseek admin screen, configure the file extension and the MIME type for your document type.
  3. Install a conversion program that can convert your document type into HTML.

Step 1: Edit your patches.py file

To tell Ultraseek about your document type, you call the function parse.define_doctype_filter. It is called as follows:

parse.define_doctype_filter(doctype, convertername, errdict)

Parameters:

  doctype (string): This is the name of your document type. If you look on the admin console under Server > Doctypes, in the section titled Document Type Parsing, the menus in the Parse as column will have your additional doctype.
  convertername (string): This is the name of your program or script that will convert your user-defined type to HTML.
  errdict (dictionary): This is a Python dictionary that converts the integer exit codes of your conversion script/program to a string error message.

For example, here is how you would add support for a rot13 file, a file type in which each letter is rotated through the alphabet by 13 positions (e.g., “abc” becomes “nop”).

Add the following lines to the patches.py file:

import parse, config

# Map the converter's non-zero exit codes to log messages
errdict = {
    1: "script error - you used the wrong exit code!",
    2: "some other error message",
    }

# Look for the script in the Ultraseek lib directory
convertername = config.program_lib_path("unrot13")

# Register the "rot13" doctype with its converter and error messages
parse.define_doctype_filter("rot13", convertername, errdict)

After making these additions, restart Ultraseek.

The code you added to patches.py will be executed and your new document type will appear in the Parse as menu in the Document Type Parsing section under Server > Doctypes.

Step 2: Configure Ultraseek to use your document filter

In the Ultraseek admin console, go to the Server > Doctypes pane.

Add the following new extension to the Document Type Specification:

.rot13 application/rot13

Add the following new document type under Document Type Parsing:

Document type: application/rot13
Parse As: rot13

Click the OK button to save your new settings.

Note: Your Web server must serve the correct MIME type for this document for Ultraseek to use the custom filter. Make sure the document type and MIME type are registered with your Web server software.
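The extension-to-MIME-type mapping entered in this step can be illustrated with Python's standard mimetypes module. This is only a sketch of the concept: it does not configure Ultraseek or your Web server, and the file name notes.rot13 is made up for the example.

```python
import mimetypes

# Register the same mapping entered under Document Type Specification.
mimetypes.add_type("application/rot13", ".rot13")

# guess_type returns a (type, encoding) pair for a file name or URL.
mime_type, encoding = mimetypes.guess_type("notes.rot13")
print(mime_type)
```

A Web server performs a similar lookup when it chooses the Content-Type header for a file, which is why the mapping must also be registered there.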

Step 3: Install your conversion program

Your conversion program will be run by Ultraseek as follows:

converter inputfile outputfile errorfile

Your converter program must read the contents of inputfile, convert it to HTML, and write the results to outputfile. Error messages may be written to errorfile.

If the conversion works normally, it should exit with an exit code of 0. If the converter fails for some reason, it should exit with a non-zero exit code.

If the exit code is non-zero, Ultraseek logs an error message that combines the message looked up in the error dictionary passed to define_doctype_filter with any output written to errorfile.
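The contract described above can be sketched in Python as follows. This is not Ultraseek's actual implementation: run_converter is a hypothetical helper, and the way the two parts of the message are joined (a colon) is an assumption made purely for illustration.

```python
import subprocess


def run_converter(converter_cmd, inputfile, outputfile, errorfile, errdict):
    """Run a converter; return None on success or a log message on failure.

    converter_cmd is a list such as ["/path/to/unrot13"]. This helper is
    hypothetical and only mimics the contract described in the text.
    """
    exit_code = subprocess.call(
        converter_cmd + [inputfile, outputfile, errorfile])
    if exit_code == 0:
        return None  # conversion succeeded, nothing to log

    # Look up the exit code in the error dictionary.
    message = errdict.get(exit_code, "unknown exit code %d" % exit_code)

    # Append whatever the converter wrote to errorfile, if anything.
    try:
        with open(errorfile) as err:
            details = err.read().strip()
    except IOError:
        details = ""
    if details:
        message = message + ": " + details  # separator is an assumption
    return message
```

Calling the helper with a converter that exits with code 2 and the sample errdict from Step 1 would yield a message beginning with "some other error message".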

For example, you can use the following shell script to convert rot13 files back into normal text:

#!/bin/sh

# $1 is the input file
# $2 is the output HTML file that will be indexed
# $3 is the file where we place any error messages that
#    we may generate

echo "<html><body>" > "$2"

# Rotate each letter by 13 positions to recover the original text
tr 'n-za-mN-ZA-M' 'a-zA-Z' < "$1" >> "$2"

echo "</body></html>" >> "$2"

# Uncomment 'exit 1' to see the 'script error' message
# from our sample error dictionary in the log file
# exit 1

exit 0

This code should be placed in the Ultraseek /lib directory in a file named unrot13 to be used in the example above.
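The same converter could also be sketched in Python. The version below is an illustration rather than a tested Ultraseek filter: it assumes a Unix-like system with a current Python 3 interpreter, the function names unrot13_to_html and main are made up, and the exit codes simply reuse the sample error dictionary from Step 1. Unlike the shell script, it escapes characters such as < and & so the decoded text cannot break the HTML.

```python
#!/usr/bin/env python3
"""Hypothetical Python counterpart of the unrot13 shell script."""
import codecs
import html
import sys


def unrot13_to_html(text):
    """Decode rot13 text and wrap it in a minimal HTML page."""
    plain = codecs.decode(text, "rot_13")
    # Escape <, > and & so the decoded text cannot break the markup.
    return "<html><body>\n" + html.escape(plain) + "\n</body></html>\n"


def main(argv):
    """Return the exit code for one conversion run."""
    # The converter is called as: converter inputfile outputfile errorfile
    if len(argv) != 4:
        return 1  # "script error" in the sample error dictionary
    inputfile, outputfile, errorfile = argv[1:4]
    try:
        with open(inputfile, "r") as src:
            page = unrot13_to_html(src.read())
        with open(outputfile, "w") as dst:
            dst.write(page)
    except (IOError, OSError, UnicodeError) as exc:
        # Leave the details in errorfile so they can appear in the log.
        with open(errorfile, "w") as err:
            err.write(str(exc))
        return 2  # "some other error message" in the sample dictionary
    return 0
```

To use this as a standalone program, end the file with sys.exit(main(sys.argv)) so that the value returned by main becomes the process exit code.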

If you are using Windows, the conversion program must be a .exe file and not a script.

Posted June 28, 2005 by editor
