Function / API design when porting Perl code to Python

In Perl, there are two scripts that users run from the shell:

perl train-truecaser.perl --model outputmodelfile --corpus inputfile1


perl truecaser.perl --model outputmodelfile < inputfile2 > outputfile2

In Python, if both scripts are to become methods of a single object, it is a bit difficult to decide how the user should interact with them, both from the shell and from Python code.

Assume there are two methods on the Truecaser() object in Python:

  • train(), which does what train-truecaser.perl does
  • truecase(), which does what truecaser.perl does

We could just mirror the Perl-style interface:

# Initialize the object.
tc = Truecaser()

# Train the model and save it to outputmodelfile.
tc.train(inputfile1, save_to=outputmodelfile)

# Truecase inputfile2 using the model stored in outputmodelfile.
outputfile2 = tc.truecase(inputfile2, model=outputmodelfile)

In this case, train(...) does not return anything; it only writes the model file.
The actual result is outputfile2.

But what if we make it less file-oriented and pass the trained model from one call to the next without going through a file, e.g.:

# Initialize the object.
tc = Truecaser()

# train() always returns the trained model (and optionally saves it).
model = tc.train(inputfile1, save_to=outputmodelfile)

# Truecase inputfile2 using the in-memory model.
outputfile2 = tc.truecase(inputfile2, model=model)

In this case, truecase() must be able to take the model as an in-memory object instead of, or in addition to, loading it from a file.
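To make that option concrete, here is a minimal sketch of a truecase() that accepts either an in-memory model or a path. Only the method names and the save_to / model parameters come from the snippets above; the rest is my assumption for illustration: the methods take an iterable of lines rather than a filename, the model is a plain dict stored as JSON, and the "most frequent casing wins" rule is just a placeholder, not the logic of the Perl scripts.

```python
import json
import os
from collections import Counter, defaultdict


class Truecaser:
    """Hypothetical sketch; the training rule is a placeholder."""

    def train(self, lines, save_to=None):
        # Count every surface form seen for each lowercased token.
        counts = defaultdict(Counter)
        for line in lines:
            for token in line.split():
                counts[token.lower()][token] += 1
        # Keep the most frequent casing per token (assumed rule).
        model = {low: forms.most_common(1)[0][0]
                 for low, forms in counts.items()}
        if save_to is not None:
            with open(save_to, "w", encoding="utf-8") as f:
                json.dump(model, f)
        return model

    def truecase(self, lines, model):
        # Accept an in-memory dict or a path to a saved model.
        if isinstance(model, (str, os.PathLike)):
            with open(model, encoding="utf-8") as f:
                model = json.load(f)
        return [" ".join(model.get(t.lower(), t) for t in line.split())
                for line in lines]
```

With this shape, tc.truecase(lines, model=model) and tc.truecase(lines, model="some/path") would both work, at the cost of a type check inside truecase().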

We could also implicitly load / replace the model whenever train() is called, e.g.:

# Initialize the object.
tc = Truecaser()

# train() still allows saving the model to a file,
# but the trained model is also kept in the tc.model attribute.
tc.train(inputfile1, save_to=outputmodelfile)

# Implicitly uses tc.model, as set by the previous tc.train() call.
outputfile2 = tc.truecase(inputfile2)
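A minimal sketch of this stateful variant, under the same assumptions as before (iterables of lines instead of filenames, a dict as the model, a placeholder training rule). The class name StatefulTruecaser, the choice to return self from train(), and the error raised when no model is present are all hypothetical; save_to is left out to keep the sketch short.

```python
from collections import Counter, defaultdict


class StatefulTruecaser:
    """Hypothetical sketch: the trained model lives on the instance."""

    def __init__(self):
        self.model = None  # set by train()

    def train(self, lines):
        # Placeholder rule: most frequent casing per lowercased token.
        counts = defaultdict(Counter)
        for line in lines:
            for token in line.split():
                counts[token.lower()][token] += 1
        self.model = {low: forms.most_common(1)[0][0]
                      for low, forms in counts.items()}
        return self  # assumed convention, allows tc.train(...).truecase(...)

    def truecase(self, lines):
        # Fail loudly instead of silently passing text through untouched.
        if self.model is None:
            raise RuntimeError("no model loaded; call train() first")
        return [" ".join(self.model.get(t.lower(), t) for t in line.split())
                for line in lines]
```

The downside I see is hidden state: the result of truecase() depends on whichever train() call happened last on that object.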

I'm not sure which of these proposals for the Python interface is more Pythonic or more intuitive for users, especially since the Perl interface is built around shell interaction, while the purpose of the Python interface is to encapsulate both the train() and truecase() methods in a single object.

What is the preferred interface for an object with these two methods in Python?

Are there examples of similar functions in other Python libraries?