Concept of object localization in ArmarX

Object localization is handled automatically by the working memory of MemoryX. When the robot needs to know the location of an object, the object is requested in the working memory, which from then on automatically calls the localizer whenever necessary, provided the object is within the field of view. Instructions on how to request objects and retrieve their location are here: memoryx-howto-retrieve-objects

For this to work, any object recognizer must be compatible with MemoryX. In addition, the prior knowledge or long-term memory needs to know that an object can be localized using this recognizer. Conversely, the memory can hold information about the object that the recognizer needs.

In this tutorial, you will learn how to implement your recognizer and integrate it into this framework.

Creating your object recognizer component

First, you need to create a component following the instructions here: ArmarXCore-Tutorials-Implementing-a-component. Let's call it "MyNewRecognizer". It should be placed in a subdirectory of VisionX/source/VisionX/components/object_perception/.

In the MyNewRecognizer.h file, include the interface definition file:

#include <VisionX/interface/components/ObjectLocalizerInterfaces.h>

The interfaces in this file allow MemoryX to call your recognizer.

Above the actual class declaration, you will find the definition of the properties of your component. They usually include the names of the proxies that you need (e.g. the image provider), the name of the coordinate system in which the camera is located, and the parameters of your algorithm.

class MyNewRecognizerPropertyDefinitions:
    public armarx::ComponentPropertyDefinitions
{
public:
    MyNewRecognizerPropertyDefinitions(std::string prefix):
        armarx::ComponentPropertyDefinitions(prefix)
    {
        defineOptionalProperty<std::string>("PriorKnowledgeProxyName", "PriorKnowledge", "name of prior memory proxy");
        defineOptionalProperty<std::string>("DataBaseObjectCollectionName", "memdb.Prior_KitchenKKObjects", "name of collection from database to use for object classes");
        defineOptionalProperty<std::string>("ImageProviderName", "Armar3ImageProvider", "name of the image provider to use");
        defineOptionalProperty<std::string>("PointCloudProviderName", "PointCloudProvider", "name of the pointcloud provider to use");
        defineOptionalProperty<std::string>("ImageReferenceFrameName", "EyeLeftCamera", "Sets the reference frame name of the pose provided by this recognizer. Must be a frame name known in ArmarPose from the robot model.");
        defineOptionalProperty<std::string>("AgentName", "Armar3", "Name of the agent for which the sensor values are provided.");
        defineOptionalProperty<std::string>("RobotStateProxyName", "RobotStateComponent", "Ice Adapter name of the robot state proxy.");
        defineOptionalProperty<float>("MyFloatParameter", 4.2, "A parameter");
        defineOptionalProperty<bool>("MyBoolParameter", true, "Another parameter");
        defineOptionalProperty<int>("MyIntegerParameter", 42, "Yet another parameter");
    }
};

Depending on whether you use images, pointclouds or both for recognition, you have to inherit from the ObjectLocalizerImageInterface, the ObjectLocalizerPointCloudInterface, or the ObjectLocalizerPointCloudAndImageInterface. Correspondingly, you also have to inherit from ImageProcessor, PointCloudProcessor, or PointCloudAndImageProcessor. Let's assume you will use both images and pointclouds, then your class declaration starts like this:

class MyNewRecognizer:
        virtual public ObjectLocalizerPointCloudAndImageInterface,
        virtual public PointCloudAndImageProcessor
{
    ...
};

You need to implement at least these functions:

void onInitPointCloudAndImageProcessor();
void onConnectPointCloudAndImageProcessor();
void onExitPointCloudAndImageProcessor();
memoryx::ObjectLocalizationResultList localizeObjectClasses(const memoryx::ObjectClassNameList& objectClassNames, const Ice::Current& c = Ice::Current());

You should also create some member variables that we will need later:

ImageProviderInterfacePrx imageProviderProxy;
PointCloudProviderInterfacePrx pointcloudProviderProxy;
CByteImage** cameraImages;

In the MyNewRecognizer.cpp file, you need to implement several methods. In the onInit...() and onConnect...() methods, you initialize your algorithm and connect to all the proxies you need:

void MyNewRecognizer::onInitPointCloudAndImageProcessor()
{
    usingImageProvider(getProperty<std::string>("ImageProviderName").getValue());
    usingPointCloudProvider(getProperty<std::string>("PointCloudProviderName").getValue());
 
    usingProxy(getProperty<std::string>("PriorKnowledgeProxyName").getValue());
 
    ...
}
 
void MyNewRecognizer::onConnectPointCloudAndImageProcessor()
{
    getImageProvider(getProperty<std::string>("ImageProviderName").getValue());
    getPointCloudProvider(getProperty<std::string>("PointCloudProviderName").getValue());
 
    imageProviderProxy = getProxy<ImageProviderInterfacePrx>(getProperty<std::string>("ImageProviderName").getValue());
    pointcloudProviderProxy = getProxy<PointCloudProviderInterfacePrx>(getProperty<std::string>("PointCloudProviderName").getValue());
 
    cameraImages = new CByteImage*[imageProviderProxy->getNumberImages()];
    for (int i = 0; i < imageProviderProxy->getNumberImages(); i++)
    {
        cameraImages[i] = tools::createByteImage(imageProviderProxy->getImageFormat(), imageProviderProxy->getImageFormat().type);
    }
 
    ...
}
 
void MyNewRecognizer::onExitPointCloudAndImageProcessor()
{
    ...
}

The localizeObjectClasses(...) method will be called automatically by the working memory and executes the actual object localization. Here, you get the current camera images and/or pointcloud, and execute your localization algorithm:

ObjectLocalizationResultList MyNewRecognizer::localizeObjectClasses(const memoryx::ObjectClassNameList& objectClassNames, const Ice::Current& c)
{
    ObjectLocalizationResultList resultList;
 
    // wait for images and pointcloud to be available
    waitForImages(1000); // timeout 1000 ms
    waitForPointClouds(1000);
 
 
    // get new camera images
    if (!getImages(cameraImages))
    {
        ARMARX_WARNING << "Unable to get camera images";
        return resultList;
    }
 
    pcl::PointCloud<pcl::PointXYZRGBA>::Ptr pointcloud(new pcl::PointCloud<pcl::PointXYZRGBA>());
    if (!getPointClouds(pointcloud))
    {
        ARMARX_WARNING << "Unable to get pointcloud";
        return resultList;
    }
 
 
    for (size_t i = 0; i < objectClassNames.size(); i++)
    {
        std::string objectClassName = objectClassNames.at(i);
 
        // execute your algorithm
        ...

When you can't find the object, you just return the empty result list. If you find one or more instances of the object, assemble a localization result for each instance, and add them to the result list:

        Eigen::Vector3f objectPosition = ...;
        Eigen::Matrix3f objectOrientation = ...;
 
        memoryx::ObjectLocalizationResult result;
        result.objectClassName = objectClassName;
 
        std::string referenceFrame = getProperty<std::string>("ImageReferenceFrameName").getValue();
        std::string agentName = getProperty<std::string>("AgentName").getValue();
 
        result.position = new armarx::FramedPosition(objectPosition, referenceFrame, agentName);
        result.orientation = new armarx::FramedOrientation(objectOrientation, referenceFrame, agentName);
 
        // estimate localization uncertainty
        FloatVector mean = {0, 0, 0};
        FloatVector vars = {10000, 10000, 10000}; // variances of the normal distribution quantifying the estimated position uncertainty of the localization (here: 10000 mm^2 per axis, i.e. a standard deviation of 100 mm)
        result.positionNoise = memoryx::MultivariateNormalDistributionPtr(new memoryx::MultivariateNormalDistribution(mean, vars));
 
        // estimate recognition certainty
        result.recognitionCertainty = 0.9;
 
        resultList.push_back(result);
    }
 
    return resultList;
}
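
The variances passed to the position noise are squared standard deviations, so it can be easier to think in terms of a standard deviation per axis and square it. The following is a minimal sketch of that conversion, not part of ArmarX: the helper name variancesFromStdDevs is hypothetical, std::vector<float> stands in for FloatVector (an assumption), and millimetres are assumed as the unit, as in the comment above.

```cpp
#include <vector>

// Hypothetical helper (not an ArmarX function): converts per-axis standard
// deviations in mm into variances in mm^2, i.e. the kind of values that are
// passed as "vars" in the listing above.
// std::vector<float> is used here as a stand-in for FloatVector.
std::vector<float> variancesFromStdDevs(const std::vector<float>& stdDevsMm)
{
    std::vector<float> variances;
    variances.reserve(stdDevsMm.size());

    for (float sigma : stdDevsMm)
    {
        // variance = (standard deviation)^2
        variances.push_back(sigma * sigma);
    }

    return variances;
}
```

Under these assumptions, a standard deviation of 100 mm on every axis yields the value 10000 per axis that is used in the listing above.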

Integrating your object recognizer with MemoryX

Information about object classes that are known a priori is stored in the prior knowledge of MemoryX. If the robot learned about an object on its own, the information is stored in the longterm memory. In practice, this makes no difference for your object recognizer.

Creating a wrapper for recognition-relevant object information

If you want to store necessary information for recognition with the object, e.g. a file containing some kind of descriptor, you should create a wrapper for easy access to that information. Add your wrapper to the MemoryX/source/MemoryX/libraries/helpers/ObjectRecognitionHelpers/ObjectRecognitionWrapper.h/cpp files.

Let's assume that the information your recognizer needs consists of a descriptor file and a float value. Then your wrapper should look like this:

In ObjectRecognitionWrapper.h:

class MyNewRecognizerWrapper : public AbstractFileEntityWrapper
{
public:
    MyNewRecognizerWrapper(const GridFileManagerPtr& gfm);
 
    std::string getDescriptorFileName() const;
    void setDescriptorFileName(const std::string& fileName, const std::string& filesDBName);
 
    float getFloatParameter();
    void setFloatParameter(const float myFloatParameter);
 
    Ice::ObjectPtr ice_clone() const;
};
 
using MyNewRecognizerWrapperPtr = IceInternal::Handle<MyNewRecognizerWrapper>;

In ObjectRecognitionWrapper.cpp:

MyNewRecognizerWrapper::MyNewRecognizerWrapper(const GridFileManagerPtr& gfm):
    AbstractFileEntityWrapper(gfm)
{
}
 
 
std::string MyNewRecognizerWrapper::getDescriptorFileName() const
{
    if (entity->hasAttribute("descriptorFileName"))
    {
        const std::string fileName = cacheAttributeFile("descriptorFileName", true);
        return fileName;
    }
 
    return "";
}
 
 
void MyNewRecognizerWrapper::setDescriptorFileName(const std::string& fileName, const std::string& filesDBName)
{
    if (fileName != "")
    {
        EntityAttributeBasePtr fileAttr = new EntityAttribute("descriptorFileName");
        fileManager->storeFileToAttr(filesDBName, fileName, fileAttr);
        cleanUpAttributeFiles(entity->getAttribute("descriptorFileName"), fileAttr);
        entity->putAttribute(fileAttr);
    }
}
 
 
float MyNewRecognizerWrapper::getFloatParameter()
{
    if (entity->hasAttribute("myFloatParameter"))
    {
        EntityPtr p = EntityPtr::dynamicCast(entity);
        return p->getAttributeValue("myFloatParameter")->getFloat();
    }
    else
    {
        ARMARX_WARNING_S << "Attribute " << "myFloatParameter" << " not set for object " << entity->getName();
        return 0;
    }
}
 
 
void MyNewRecognizerWrapper::setFloatParameter(const float myFloatParameter)
{
    EntityAttributePtr myFloatParameterAttr = new EntityAttribute("myFloatParameter");
    myFloatParameterAttr->setValue(new Variant(myFloatParameter));
    entity->putAttribute(myFloatParameterAttr);
}
 
 
Ice::ObjectPtr MyNewRecognizerWrapper::ice_clone() const
{
    return new MyNewRecognizerWrapper(*this);
}

Accessing the object information

You can set and read the information that is stored with the object in the memory. In the onConnect...() method of your recognizer, you will probably want to access information that you need to recognize an object, e.g. its descriptor file. Assuming that the information about the object is kept in the PriorKnowledge, this can be done with the following code, which you should add to your onConnect...() method:

// get proxies to memory
PriorKnowledgeInterfacePrx priorKnowledgeProxy = getProxy<PriorKnowledgeInterfacePrx>(getProperty<std::string>("PriorKnowledgeProxyName").getValue());
PersistentObjectClassSegmentBasePrx classesSegmentProxy = priorKnowledgeProxy->getObjectClassesSegment();
CommonStorageInterfacePrx databaseProxy = priorKnowledgeProxy->getCommonStorage();
 
// get the file manager of the database
GridFileManagerPtr fileManager;
fileManager.reset(new GridFileManager(databaseProxy));
 
// set the database collection that contains our object information
CollectionInterfacePrx coll = databaseProxy->requestCollection(getProperty<std::string>("DataBaseObjectCollectionName").getValue());
classesSegmentProxy->addReadCollection(coll);
 
// find all object classes that use our recognizer type
EntityIdList idList = classesSegmentProxy->getAllEntityIds();
for (EntityIdList::iterator iter = idList.begin(); iter != idList.end(); iter++)
{
    EntityPtr entity = EntityPtr::dynamicCast(classesSegmentProxy->getEntityById(*iter));
 
    if (entity)
    {
        ObjectRecognitionWrapperPtr recognitionWrapper = entity->addWrapper(new ObjectRecognitionWrapper());
 
        // if this object is localizable by our recognizer...
        if (recognitionWrapper->getRecognitionMethod() == getName())
        {
            std::string className = entity->getName();
            ARMARX_INFO << "Adding class " << className << " to " << getDefaultName();
 
            // get information needed by the localizer
            MyNewRecognizerWrapperPtr myNewRecognizerWrapper = entity->addWrapper(new MyNewRecognizerWrapper(fileManager));
            float myFloatParameter = myNewRecognizerWrapper->getFloatParameter();
            std::string descriptorFileName = myNewRecognizerWrapper->getDescriptorFileName();
            // now you can load the descriptor file: the filename points to where the database has written a copy of the file in its cache directory on the disk
            ...
        }
    }
}
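
The selection step in the loop above (keep only the object classes whose recognition method matches this recognizer) can be illustrated in isolation. The following is a minimal sketch using plain structs instead of MemoryX entities: ObjectClassInfo and classesForRecognizer are hypothetical names introduced only for illustration and are not part of MemoryX.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an object class entity together with the
// recognition method stored in its ObjectRecognitionWrapper.
struct ObjectClassInfo
{
    std::string name;
    std::string recognitionMethod;
};

// Returns the names of all classes whose recognition method equals the
// given recognizer name, mirroring the comparison with getName() above.
std::vector<std::string> classesForRecognizer(const std::vector<ObjectClassInfo>& classes,
                                              const std::string& recognizerName)
{
    std::vector<std::string> result;

    for (const ObjectClassInfo& objectClass : classes)
    {
        if (objectClass.recognitionMethod == recognizerName)
        {
            result.push_back(objectClass.name);
        }
    }

    return result;
}
```

Note that the comparison is an exact string match, so the recognition method stored with an object class has to be spelled exactly like the name of the recognizer component.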

Writing object information to the memory

If you want to add an object to the prior knowledge, get the proxies as above, and then:

CollectionInterfacePrx coll = databaseProxy->requestCollection(getProperty<std::string>("DataBaseObjectCollectionName").getValue());
classesSegmentProxy->setWriteCollection(coll);
 
ObjectClassPtr newObjectClass = ObjectClassPtr(new ObjectClass());
newObjectClass->setName("objectName");
 
ObjectRecognitionWrapperPtr recognitionWrapper = newObjectClass->addWrapper(new ObjectRecognitionWrapper());
recognitionWrapper->setRecognitionMethod("MyNewRecognizer");
recognitionWrapper->setDefaultMotionModel("Static");
 
MyNewRecognizerWrapperPtr myNewRecognizerWrapper = newObjectClass->addWrapper(new MyNewRecognizerWrapper(fileManager));
 
myNewRecognizerWrapper->setFloatParameter(4.2);
 
// get name of the filesDB, add the descriptor file to the DB
std::string ns = classesSegmentProxy->getWriteCollectionNS();
size_t dotPosition = ns.find_first_of('.');
if (dotPosition != std::string::npos)
{
    std::string filesDBName = ns.substr(0, dotPosition);
    myNewRecognizerWrapper->setDescriptorFileName("filename", filesDBName);
}
 
classesSegmentProxy->addEntity(newObjectClass);
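
The name of the files database is taken to be the part of the collection namespace before the first dot. The following is a minimal, self-contained sketch of that string handling: the function name filesDbNameFromNamespace is hypothetical, and the example namespace is simply the default value of the DataBaseObjectCollectionName property shown earlier.

```cpp
#include <string>

// Hypothetical helper (not a MemoryX function): returns the part of a
// collection namespace before the first '.', or an empty string if the
// namespace contains no dot.
std::string filesDbNameFromNamespace(const std::string& ns)
{
    const std::size_t dotPosition = ns.find_first_of('.');

    if (dotPosition == std::string::npos)
    {
        return "";
    }

    return ns.substr(0, dotPosition);
}
```

For the namespace "memdb.Prior_KitchenKKObjects", this yields "memdb". A namespace without a dot yields an empty string, which corresponds to the case in which the listing above skips storing the descriptor file.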

Adapting the PriorMemoryEditor to allow entering information for your recognizer

The PriorMemoryEditor lets you manually add objects to the prior knowledge and enter information about them, e.g. for recognition or grasping. To use it, you must have started Ice and MongoDB, as well as the CommonStorage and PriorMemory components, e.g. by starting the Armar3Simulation scenario. You will then find the PriorMemoryEditor in the "Memory" category of the ArmarXGui.

To add a new object to the prior knowledge, click on "Add":

[Figure: The PriorMemoryEditor]

In the dialog that has opened now, you can enter the name of the object and set the recognition method as well as parameters for it. If your new recognizer doesn't need any parameters, just select "Other" as the recognition method and enter the name of your recognizer in the field at the bottom.

[Figure: The object class edit dialog]

If you want to store information with the object that is needed by your recognizer, you have to extend this dialog. The code and gui files for it are in MemoryX/gui-plugins/PriorMemoryEditor.

First, in the file PriorMemoryEditorPlugin.cpp, you have to add a wrapper for your class in two places:

void PriorEditorController::doEditClass(bool isNew)
{
    ...
    objectClass->addWrapper(new MyNewRecognizerWrapper(fileManager));
    ...
}
 
void PriorEditorController::updateObject(const memoryx::ObjectClassPtr objClass, bool force)
{
    ...
    objClass->addWrapper(new MyNewRecognizerWrapper(fileManager));
    ...
}

The gui is defined in the file ObjectClassEditDialog/RecognitionAttributesEditTab.ui. It can be edited with Qt Creator. Here you have to add your recognizer to the dropdown menu list, and add input fields where the parameters can be entered.

In the ObjectClassEditDialog/RecognitionAttributesEditTab.h file, add a member variable for the name of your recognizer:

const std::string myNewRecognizer;

In the ObjectClassEditDialog/RecognitionAttributesEditTab.cpp file, set the value of that variable to the name that you entered in the dropdown menu of the gui. This happens in the constructor:

RecognitionAttributesEditTab::RecognitionAttributesEditTab(QWidget* parent)
    : EntityAttributesEditTab(parent),
      ...
      myNewRecognizer("MyNewRecognizer"),
      otherRecognitionMethod("Other")
{
    ...
}

In the method updateGui(const EntityPtr& entity), the values stored with the object are written to the input elements that you added to the gui. In updateEntity(const EntityPtr& entity, std::string filesDBName), the values that were entered in the gui are stored to the object. You can implement both analogously to the cases that are already handled there. Note that the input fields that you created in the gui are member variables of "ui". In the method recognitionMethodChanged(const QString& method), set the activation status of your input fields in the same way as for the existing recognizers.