Archive: Customizing
Learn Python in 10 Minutes or Less
Ryan Weisenberger
Python is the primary language used throughout Ultraseek. Python is only a few years old, but has quickly become robust and powerful. Having an understanding of Python can help you perform advanced customizations of the user interface, and easily augment Ultraseek’s behavior through
Step 1. Download the Python interpreter
Step 2. Run Python
After you have installed Python on your system, run it according to the Python installation instructions. As Python starts, it will display something like the following:
That For example, enter
Step 3. Take the Python tutorial
Now that you have a fully functioning Python interpreter on your system, you should take the Python tutorial. This tutorial will guide you through the basics of Python, such as the syntax, flow control, and the use of modules.
Step 4. Read a book or two
At this point, you should have a strong enough grasp of the basic Python programming concepts to go meddle around in the Ultraseek user interface, or maybe even
Posted July 08, 2005 by editor
Adding a Custom Document Filter
Ryan Weisenberger
While Ultraseek can parse a wide variety of document formats from several different vendors, occasionally an enterprise has a document format that is unknown to Ultraseek. In this case, it is possible to add your own document filter so Ultraseek can parse your custom document type.
To add a user-defined document type to Ultraseek, there are three steps you must follow.
Step 1: Edit your
| parameter | type | description |
|---|---|---|
| doctype | string | The name of your document type. On the admin console under Server > Doctypes, in the section titled Document Type Parsing, the menus in the Parse as column will list your additional doctype. |
| convertername | string | The name of the program or script that converts your user-defined type to HTML. |
| errdict | dictionary | A Python dictionary that maps the integer exit codes of your conversion script or program to string error messages. |
For example, here is how you would add support for a rot13 file, a file type in which each letter is rotated through the alphabet by 13 positions, so that “abc” becomes “nop”.
Add the following lines to the patches.py file:
import parse, config

# Map the converter's integer exit codes to log messages.
errdict = {
    1: "script error - you used the wrong exit code!",
    2: "some other error message",
}

# Look for the script in the Ultraseek lib directory.
convertername = config.program_lib_path("unrot13")

# Register the "rot13" doctype with its converter and error messages.
parse.define_doctype_filter("rot13", convertername, errdict)
After making these additions, restart Ultraseek.
The code you added to patches.py will be executed and your new document type will appear in the Parse as menu in the Document Type Parsing section under Server > Doctypes.
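For readers unfamiliar with rot13, the transformation itself can be checked in a standalone Python 3 session. This is only an illustration of the encoding and is not part of patches.py; it relies on the `rot_13` codec from the Python standard library, and the helper name `rot13` is made up for this sketch.

```python
import codecs

def rot13(text):
    """Rotate each ASCII letter 13 places; other characters are unchanged."""
    return codecs.encode(text, "rot_13")

encoded = rot13("abc")
print(encoded)         # nop
print(rot13(encoded))  # abc, since applying rot13 twice restores the text
```

Because the alphabet has 26 letters, the same function both encodes and decodes.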
Step 2: Configure Ultraseek to use your document filter
In the Ultraseek admin console, go to the Server > Doctypes pane.
Add the following new extension to the Document Type Specification:
.rot13 application/rot13
Add the following new document type under Document Type Parsing:
Document type: application/rot13
Parse As: rot13
Click the OK button to save your new settings.
Note: Your Web server must serve the correct MIME type for this document for Ultraseek to use the custom filter. Make sure the document type and MIME type are registered with your Web server software.
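How the MIME type is registered depends on your Web server software, so consult its documentation for the actual setting. As a rough illustration of the extension-to-type mapping involved, Python's standard `mimetypes` module models the same idea. The file name `report.rot13` is a made-up example, and this snippet does not configure any server.

```python
import mimetypes

# Register the same mapping entered in the Document Type Specification.
mimetypes.add_type("application/rot13", ".rot13")

# guess_type returns a (type, encoding) tuple for a file name or URL.
mime_type, encoding = mimetypes.guess_type("report.rot13")
print(mime_type)  # application/rot13
```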
Step 3: Install your conversion program
Your conversion program will be run by Ultraseek as follows:
converter inputfile outputfile errorfile
Your converter program must read the contents of inputfile, convert it to HTML, and write the results to outputfile. Error messages may be written to errorfile.
If the conversion works normally, it should exit with an exit code of 0. If the converter fails for some reason, it should exit with a non-zero exit code.
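Before installing a converter, it can help to exercise it by hand using the same three-argument convention. The following is a minimal sketch of such a test harness, not Ultraseek's own invocation code. The function name `run_converter` and the use of temporary files are assumptions made for illustration.

```python
import os
import subprocess
import tempfile

def run_converter(command, input_text):
    """Run `command inputfile outputfile errorfile` and collect the results.

    `command` is a list such as ["/path/to/converter"]. Returns a tuple of
    (exit_code, output_text, error_text).
    """
    with tempfile.TemporaryDirectory() as workdir:
        input_path = os.path.join(workdir, "input")
        output_path = os.path.join(workdir, "output")
        error_path = os.path.join(workdir, "errors")

        with open(input_path, "w") as f:
            f.write(input_text)
        # Create empty output and error files so they can always be read back.
        open(output_path, "w").close()
        open(error_path, "w").close()

        # Append the three file arguments in the documented order.
        result = subprocess.run(command + [input_path, output_path, error_path])

        with open(output_path) as f:
            output_text = f.read()
        with open(error_path) as f:
            error_text = f.read()
    return result.returncode, output_text, error_text
```

Calling `run_converter` with the path to your converter and a small sample input should return an exit code of 0 and the generated HTML when the converter works as intended.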
If the exit code is non-zero, Ultraseek logs an error message that combines the message looked up in the error dictionary passed to define_doctype_filter with any output written to errorfile.
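The logged message can be thought of as the error dictionary lookup followed by whatever the converter wrote to errorfile. The sketch below illustrates that idea only. The function name `describe_failure`, the fallback text for unknown exit codes, and the exact formatting are assumptions, and Ultraseek's actual log format may differ.

```python
def describe_failure(exit_code, errdict, errorfile_text):
    """Combine the errdict message for exit_code with the errorfile contents."""
    # Fall back to a generic message for exit codes missing from errdict
    # (this fallback is an assumption, not documented Ultraseek behavior).
    message = errdict.get(exit_code, "unknown exit code %d" % exit_code)
    details = errorfile_text.strip()
    if details:
        message = "%s: %s" % (message, details)
    return message

errdict = {
    1: "script error - you used the wrong exit code!",
    2: "some other error message",
}
print(describe_failure(1, errdict, "could not read input\n"))
# script error - you used the wrong exit code!: could not read input
```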
For example, you can use the following shell script to convert rot13 files back into normal text:
#!/bin/sh
# $1 is the input file
# $2 is the output HTML file that will be indexed
# $3 is the file where we place any error messages
#    that we may generate
echo "<html><body>" > "$2"
# Rotate every letter by 13 positions to undo rot13
tr 'n-za-mN-ZA-M' 'a-zA-Z' < "$1" >> "$2"
echo "</body></html>" >> "$2"
# If we do this 'exit 1', we will see the 'script error'
# message from our sample error dictionary in the log file:
# exit 1
exit 0
To use this script with the example above, place it in the Ultraseek /lib directory in a file named unrot13.
If you are using Windows, the conversion program must be a .exe file and not a script.
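The same converter logic can also be written in Python, which may be a convenient starting point when a shell is not available. This is a sketch under assumptions: the function names are hypothetical, exit codes 1 and 2 are chosen to line up with the sample error dictionary, and on Windows the script would still have to be turned into a .exe by some means not covered here. Unlike the shell script, it escapes HTML special characters in the decoded text.

```python
import codecs
import html
import sys

def convert(input_path, output_path):
    """Decode a rot13 file and write it out as a minimal HTML page."""
    with open(input_path, encoding="utf-8") as f:
        # rot13 is its own inverse, so encoding again decodes the text.
        decoded = codecs.encode(f.read(), "rot_13")
    with open(output_path, "w", encoding="utf-8") as f:
        f.write("<html><body>\n")
        f.write(html.escape(decoded))
        f.write("\n</body></html>\n")

def main(argv):
    """Follow the `converter inputfile outputfile errorfile` convention."""
    if len(argv) != 4:
        return 2  # wrong number of arguments
    input_path, output_path, error_path = argv[1:4]
    try:
        convert(input_path, output_path)
    except (OSError, UnicodeError) as exc:
        # Report the failure through errorfile and a non-zero exit code.
        with open(error_path, "w", encoding="utf-8") as f:
            f.write(str(exc))
        return 1
    return 0
```

To run it as a program, call `sys.exit(main(sys.argv))` under the usual `if __name__ == "__main__":` guard at the bottom of the file.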
Posted June 28, 2005 by editor