DataWorks: Use a PyODPS node to send emails

Last Updated: Nov 22, 2024

This topic describes how to use a PyODPS node that runs on an exclusive resource group to send emails in DataWorks.

Background information

Unlike a standalone Python script, a PyODPS node in DataWorks is designed to interact with MaxCompute for data analysis and processing. DataWorks itself cannot send emails automatically on a schedule. To work around this limitation, you can create a PyODPS node and run it on an exclusive resource group so that the node reads data from MaxCompute and sends the data by email.
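
The read-then-send flow described above comes down to turning query records into text that becomes the email body. The following minimal sketch illustrates that step only. The function name records_to_text, its separator parameter, and the sample rows are hypothetical, and the sketch assumes that every column value converts cleanly with str(). It uses only the standard library and is written to run on both Python 2.7 and Python 3.

```python
def records_to_text(records, columns, separator=' '):
    """Render each record as one line that contains the selected columns."""
    lines = []
    for record in records:
        # str() guards against non-string column values such as integers.
        lines.append(separator.join(str(record[name]) for name in columns))
    return '\n'.join(lines)


# Stand-in rows for illustration. In a PyODPS node, the records would come
# from a reader opened on a query result instead of a hard-coded list.
sample = [{'id': 1, 'name': 'alice'}, {'id': 2, 'name': 'bob'}]
body = records_to_text(sample, ['id', 'name'])
print(body)
```

Keeping the formatting in a separate function makes it possible to check the email body with sample rows before any mail server is involved.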

Note

TCP port 25 is the default email service port. For security reasons, TCP port 25 is disabled for Elastic Compute Service (ECS) instances by default. As a result, you cannot use port 25 in exclusive resource groups. We recommend that you use port 465 to send emails.

If a PyODPS node is run on an exclusive resource group to send emails, the users of the exclusive resource group cannot connect to the ECS instance that is used to send emails in the resource group. As a result, the users cannot install additional third-party Python modules to implement more features.
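
Because port 25 is unavailable and extra modules cannot be installed, the connection has to be opened with the standard smtplib module on another port. The following sketch shows one way to pick the connection style from the port. It is an illustration under assumptions, not part of DataWorks: the helper names smtp_mode_for_port and open_smtp and the 30-second timeout are hypothetical, and the sketch handles only ports 465 and 587.

```python
import smtplib


def smtp_mode_for_port(port):
    """Map a port to the connection style that this sketch uses for it."""
    if port == 465:
        return 'ssl'        # TLS from the first byte (smtplib.SMTP_SSL)
    if port == 587:
        return 'starttls'   # plain connection that is upgraded with STARTTLS
    # Port 25 and any other port are deliberately rejected by this sketch.
    raise ValueError('port %d is not handled by this sketch' % port)


def open_smtp(host, port=465, timeout=30):
    """Open a connection to host:port. The caller logs in and calls quit()."""
    mode = smtp_mode_for_port(port)
    if mode == 'ssl':
        return smtplib.SMTP_SSL(host, port, timeout=timeout)
    client = smtplib.SMTP(host, port, timeout=timeout)
    client.ehlo()
    client.starttls()
    client.ehlo()  # greet the server again after the TLS upgrade
    return client
```

A caller would use the returned object the same way as any smtplib client: call login(), then sendmail(), then quit(). Rejecting unknown ports early gives a clear error instead of a connection that hangs until the timeout expires.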

Procedure

  1. Create an exclusive resource group.

    1. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, click Resource Group.

    2. On the Exclusive Resource Groups tab of the page that appears, click Create Resource Group.

    3. Configure the parameters based on your business requirements. For more information, see Create and use a serverless resource group.

      Note

      Make sure that the resource group that you want to create resides in the same region as your DataWorks workspace.

    4. Click Buy Now.

  2. Associate the resource group with a workspace.

    1. Find the created resource group and click Associate Workspace in the Actions column.

    2. In the Associate Workspace panel, find the workspace with which you want to associate the resource group and click Associate in the Actions column.

  3. Go to the DataStudio page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and Governance > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  4. Create a PyODPS 2 node.

    1. On the DataStudio page, move the pointer over the Create icon and choose Create Node > MaxCompute > PyODPS 2.

      You can also click the name of the desired workflow in the Business Flow section, right-click MaxCompute, and then choose Create Node > PyODPS 2.

    2. In the Create Node dialog box, configure the Name and Path parameters.

      Note

      The node name cannot exceed 128 characters in length and can contain letters, digits, underscores (_), and periods (.).

    3. Click Confirm.

    4. On the configuration tab of the PyODPS 2 node, enter the following code to send emails by using Simple Mail Transfer Protocol (SMTP):

      import smtplib
      from email.mime.text import MIMEText

      mail_host = '<yourHost>'                # The address of the email server.
      mail_username = '<yourUserName>'        # The username that is used to log on to the mailbox.
      mail_password = '<yourPassWord>'        # The password that is used to log on to the mailbox.
      mail_sender = '<senderAddress>'         # The email address of the sender.
      mail_receivers = ['<receiverAddress>']  # The email addresses of the recipients.
      mail_content = ''                       # The email content that you want to send.

      # o is the MaxCompute entry object that DataWorks predefines in PyODPS nodes.
      # Replace query_sql and column_name with your own query and column names.
      with o.execute_sql('query_sql').open_reader() as reader:
          for record in reader:
              # str() prevents a TypeError when a column value is not a string.
              mail_content += str(record['column_name']) + ' ' + str(record['column_name']) + '\n'

      message = MIMEText(mail_content, 'plain', 'utf-8')
      message['Subject'] = 'mail test'
      message['From'] = mail_sender
      message['To'] = mail_receivers[0]

      try:
          # Port 465 expects an SSL/TLS connection from the start.
          smtpObj = smtplib.SMTP_SSL(mail_host, 465)
          smtpObj.login(mail_username, mail_password)
          smtpObj.sendmail(mail_sender, mail_receivers, message.as_string())
          smtpObj.quit()
          print('mail send success')
      except smtplib.SMTPException as e:
          print('mail send error', e)

      You can also enter the following code to send emails:

      import smtplib
      from email.mime.text import MIMEText

      mail_host = '<yourHost>'                # The address of the email server, not an email address.
      mail_username = '<yourUserName>'        # The username that is used to log on to the mailbox.
      mail_password = '<yourPassWord>'        # The password that is used to log on to the mailbox.
      mail_sender = '<senderAddress>'         # The email address of the sender.
      mail_receivers = ['<receiverAddress>']  # The email addresses of the recipients.
      mail_content = ''                       # The email content that you want to send.

      # o is the MaxCompute entry object that DataWorks predefines in PyODPS nodes.
      # Replace query_sql and column_name with your own query and column names.
      with o.execute_sql('query_sql').open_reader() as reader:
          for record in reader:
              # str() prevents a TypeError when a column value is not a string.
              mail_content += str(record['column_name']) + ' ' + str(record['column_name']) + '\n'

      message = MIMEText(mail_content, 'plain', 'utf-8')
      message['Subject'] = 'mail test'
      message['From'] = mail_sender
      message['To'] = mail_receivers[0]

      try:
          # Port 587 starts unencrypted and is upgraded to TLS by STARTTLS.
          smtpObj = smtplib.SMTP()
          smtpObj.connect(mail_host, 587)
          smtpObj.ehlo()
          smtpObj.starttls()
          smtpObj.login(mail_username, mail_password)
          smtpObj.sendmail(mail_sender, mail_receivers, message.as_string())
          smtpObj.quit()
          print('mail send success')
      except smtplib.SMTPException as e:
          print('mail send error', e)

      Note

      When you use the PyODPS 2 node to send emails, the node stores the data that it reads in a temporary file and then sends the data by email. There is no limit on the number of data records that you can send in an email.

    5. Click the Save icon in the top toolbar.

  5. Commit the node.

    Important

    You must configure the Rerun and Parent Nodes parameters on the Properties tab before you commit the node.

    1. Click the Submit icon in the top toolbar.

    2. In the Submit dialog box, enter information in the Change description field.

    3. Click Confirm.

    If the workspace that you use is in standard mode, you must click Deploy in the upper-right corner to deploy the node after you commit the node. For more information about how to deploy a node, see Deploy nodes.

  6. Change the resource group that is used to run the PyODPS 2 node.

    1. In the upper-right corner of the configuration tab of the PyODPS 2 node, click Operation Center.

    2. In the left-side navigation pane of the Operation Center page, choose Auto Triggered Node O&M > Auto Triggered Nodes.

    3. On the page that appears, find the desired node and choose More > Modify Scheduling Resource Group in the Actions column.

    4. In the Modify Scheduling Resource Group dialog box, select the desired resource group from the New Resource Group drop-down list.

    5. Click OK.

  7. Test the node. For more information, see View and manage auto triggered tasks.
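
Before you test the node against a real mail server, it can help to exercise the message-building and sending logic with a stand-in client that records mail instead of delivering it. The following is a minimal sketch of that idea, not a DataWorks feature. The names build_message, send_message, and RecordingClient, as well as the example addresses, are hypothetical. Unlike the node code in the procedure, this sketch lists every recipient in the To header, which is a design choice rather than a requirement.

```python
from email.mime.text import MIMEText


def build_message(body, sender, receivers, subject):
    """Build a plain-text UTF-8 message addressed to all receivers."""
    message = MIMEText(body, 'plain', 'utf-8')
    message['Subject'] = subject
    message['From'] = sender
    message['To'] = ', '.join(receivers)  # show every recipient in the header
    return message


def send_message(client, message, sender, receivers):
    """Send through any object that offers sendmail() and quit()."""
    try:
        client.sendmail(sender, receivers, message.as_string())
    finally:
        client.quit()  # close the session even if sendmail() raises


class RecordingClient(object):
    """Hypothetical stand-in for an SMTP client. It stores mail in memory."""

    def __init__(self):
        self.sent = []
        self.closed = False

    def sendmail(self, sender, receivers, raw):
        self.sent.append((sender, list(receivers), raw))

    def quit(self):
        self.closed = True


receivers = ['a@example.com', 'b@example.com']
message = build_message('1 alice\n2 bob\n', 'sender@example.com',
                        receivers, 'mail test')
client = RecordingClient()
send_message(client, message, 'sender@example.com', receivers)
print('%d message(s) captured' % len(client.sent))
```

In the node itself, an object such as smtplib.SMTP_SSL that has already logged on would take the place of RecordingClient, because send_message relies only on the sendmail() and quit() methods.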